NEW PLANT GENES
FIELD OF THE INVENTION 

This invention relates to glutathione transferase (GST) subunits, to nucleic acid
sequences encoding glutathione transferase subunits, and to uses of these glutathione
transferases and coding sequences, especially in the field of plant biotechnology.

BACKGROUND OF THE INVENTION 

Glutathione transferases (GSTs, EC 2.5.1.18), also referred to as glutathione
S-transferases, are multifunctional enzymes capable of catalysing the conjugation of
electrophilic substrates with the tripeptide glutathione (GSH,
gamma-glutamylcysteinylglycine). The electrophilic substrate may be of natural or
synthetic origin, examples including endogenous stress-metabolites, drugs, pesticides
and pollutants. Conjugation with GSH renders the compounds non-toxic and suitable for
export from the cytosol and further metabolism. In addition to their activities in GSH
conjugation, GSTs may have additional activities as glutathione peroxidases, catalysing
the reduction of organic hydroperoxides to the corresponding alcohol according to the
reaction:

R-OOH + 2 GSH -> R-OH + GSSG + H2O.

All known active GSTs are composed of two polypeptide subunits, with each subunit
possessing a binding site for GSH and the electrophilic co-substrate. The two subunits
may either be identical, giving rise to a homodimer, or dissimilar, giving rise to
heterodimers. GSTs may therefore be defined according to their source, or class, and
their component subunits according to the nomenclature SpGST x-y, where Sp = source
or class of GST; x and y describe the subunit types.
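
The SpGST x-y nomenclature described above can be illustrated with a short Python sketch. The class name `GSTDimer` and its fields are hypothetical and introduced only for illustration; the sketch simply composes a name from a source prefix and two subunit types and reports whether the dimer is a homodimer.

```python
from typing import NamedTuple


class GSTDimer(NamedTuple):
    """Hypothetical record for a dimeric GST named as SpGST x-y."""
    source: str     # source or class prefix, e.g. "Zm" for Zea mays
    subunit_x: str  # first subunit type, e.g. "I"
    subunit_y: str  # second subunit type, e.g. "II"

    @property
    def name(self) -> str:
        # Compose the name as <source>GST<x>-<y>
        return f"{self.source}GST{self.subunit_x}-{self.subunit_y}"

    @property
    def is_homodimer(self) -> bool:
        # A homodimer is formed from two identical subunit types
        return self.subunit_x == self.subunit_y


if __name__ == "__main__":
    dimer = GSTDimer("Zm", "I", "II")
    print(dimer.name, "homodimer" if dimer.is_homodimer else "heterodimer")
```

Under this convention a dimer of two type I maize subunits would be written ZmGSTI-I, and a dimer of type I and type II subunits ZmGSTI-II.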

Each discrete subunit is encoded by a distinct gene, with many eukaryotes containing
GST multigene families encoding multiple isoenzymes.

The plant in which GSTs have been characterised in the greatest detail is maize (Zea
mays L.). The major maize GSTs are composed of three discrete subunits, termed I, II
and III. These subunits associate together to form three isoenzymes containing the Zea
mays GST I subunit, namely ZmGSTI-I, ZmGSTI-II and ZmGSTI-III, as well as the
homodimers ZmGSTII-II and ZmGSTIII-III. The nucleotide sequences of ZmGSTI,
ZmGSTII and ZmGSTIII have been determined. In view of their relatedness in
sequence, these maize GSTs have collectively been termed type I plant GSTs.
Additional maize GSTs with activities toward herbicides have been described as
ZmGSTV-V and ZmGSTV-VI. The sequence of ZmGSTV differs markedly from the
other maize GSTs described to date, resembling the auxin-inducible GSTs from
dicotyledonous plants which have been termed the type III GSTs.

The maize GST subunit types are associated with differing substrate specificities. The
ZmGSTI subunit has broad-ranging, but low, activities toward chloro-s-triazine,
chloroacetanilide and diphenyl ether herbicides. The ZmGSTII and ZmGSTIII subunits
show greater specificity toward chloroacetanilides, while ZmGSTV and ZmGSTVI are
highly active toward diphenyl ethers. The GST isoenzymes differ in their patterns of
expression in the organs of maize. Thus, ZmGSTI-I and ZmGSTV-V are expressed in
all plant parts, while ZmGSTI-II is root specific. The expression of the GST subunits is
also differentially affected by herbicide safeners. These are compounds which enhance
the tolerance of cereal crops to herbicides, in part, by increasing the expression of
detoxifying enzymes such as GSTs. Thus, the ZmGSTII and ZmGSTV subunits
accumulate in maize seedlings following treatment with the safeners dichlormid or
benoxacor, while the ZmGSTI and ZmGSTIII subunits are only modestly enhanced by
safeners.

Far less is known regarding GSTs in plant species other than maize. GSTs with
activities toward non-herbicide substrates have been identified in some plants, and
mRNAs apparently encoding GSTs have been shown to be expressed in plants
including carnation, tobacco and thale cress (Arabidopsis thaliana). However,
isoenzymes with activities toward herbicides have only been definitively identified in
soybean, pea and pine trees. Of these, only in soybean has the nucleotide coding
sequence of the herbicide-detoxifying GST been reported.

GSTs in plants have also been shown to have secondary activities as glutathione
peroxidases, able to reduce organic hydroperoxides, such as fatty acid hydroperoxides,
to the corresponding monohydroxy alcohols. GSTs with glutathione peroxidase activity
have been isolated from peas, soybean, A. thaliana and wheat flour. Since fatty acid
hydroperoxides are a common result of membrane peroxidation imposed during
oxidative stress, glutathione peroxidases provide an important cytoprotective function
in preventing the accumulation of fatty acid hydroperoxides and their subsequent
degradation to toxic aldehydes. Glutathione peroxidases may therefore have a vital
function in protecting plant cells from oxidative stress. The intervention of glutathione
peroxidases in lipid peroxidation has also been cited as a determinant of flour quality in
wheat.

Of particular relevance to this invention is the lack of knowledge concerning the GSTs 
of wheat (Triticum aestivum L.). 

Some information is available from experiments on whole plants and plant extracts. 
Several herbicides including examples of the chloroacetanilides, as well as 
dimethenamid and fenoxaprop-ethyl undergo GSH conjugation in the course of their 
detoxification in wheat. Also, in crude plant extracts GST activities toward 
chloroacetanilide herbicides, dimethenamid and fenoxaprop-ethyl have been
demonstrated. 

There have been very few reports of the purification of GSTs from wheat. A GST was
purified from wheat flour, and described as a homodimer of 27.5 kDa polypeptides with
activity toward the non-herbicide substrate 1-chloro-2,4-dinitrobenzene (CDNB) and
glutathione peroxidase activity toward fatty acid hydroperoxides. A safener-induced
GST with activity toward CDNB and dimethenamid, termed GST TSI-1, has been
purified and partially sequenced from the wheat progenitor species Triticum
tauschii (Riechers et al., (1997), Plant Physiology, 114, pages 1461 to 1470).

Moreover, very little is known regarding GST genes in wheat. An mRNA originally
described as WIR5, which showed sequence similarity to the type I maize GSTs, was
identified as accumulating in wheat leaves during the onset of acquired resistance to
powdery mildew (Erysiphe graminis). The gene was termed gstA1 and shown to be
similar in genomic organisation to maize ZmGSTI. The gstA1 polypeptide was
expressed in recombinant bacteria and shown to have an apparent molecular mass of 29
kDa. The respective enzyme showed GST activity towards the non-herbicide CDNB,
though the activity toward other substrates and activity as a glutathione peroxidase were
not reported. An antibody was raised to the recombinant GstA1 and used in Western
blotting experiments to show that this GST was specifically induced in wheat leaves by
pathogen attack. In contrast, a distinct class of GSTs composed of 25 kDa and 26 kDa
subunits, which were recognised by an antiserum raised to undefined GSTs in maize,
accumulated following exposure to cadmium and the herbicides atrazine, alachlor and
paraquat. The activities of these xenobiotic-inducible GSTs in wheat and the
corresponding nucleotide sequences were not reported. A cDNA corresponding to an
mRNA encoding a safener-inducible type III GST has been isolated from Triticum
tauschii and had the same amino acid sequence as GST TSI-1 (Riechers et al., (1997),
Plant Physiology, 114, page 1568).

Thus, although wheat is an important crop plant, there has been little molecular
characterisation of wheat GSTs or their genes and, to date, only two purified GSTs and
two GST gene sequences, gstA1 and GST TSI-1, are available.

Significantly, neither of the purified recombinant GST proteins expressed from the genes
gstA1 or GST TSI-1 was reported to exhibit activity towards herbicides. Hence, none of the
previous work on wheat GSTs actually provides any means of achieving herbicide
resistance based on the function of wheat GSTs.

SUMMARY OF THE INVENTION

We have purified four GST isoenzymes with activity toward herbicides from wheat
shoots treated with the herbicide safener fenchlorazole-ethyl and have identified four
distinct subunits. In safener-treated shoots, we have found that the predominant GST
subunit is a 25 kDa polypeptide, which has been termed Triticum aestivum GST 1
(TaGST1). Additionally, two distinct 26 kDa subunits have been identified and termed
TaGST2 and TaGST3, and a 24 kDa subunit, termed TaGST4. These subunits associate
together to form the active dimeric isoenzymes TaGST1-1, TaGST1-2, TaGST1-3 and
TaGST1-4.

In our experiments, the expression of all four isoenzymes was affected by the herbicide
safener fenchlorazole-ethyl, although each one responds in a somewhat different way.
The TaGST1-1 isoenzyme is the major GST present in the leaves of untreated wheat
seedlings, and its expression is increased by approximately 50% following exposure to
fenchlorazole-ethyl. TaGST1-4 is expressed at low levels in untreated shoots and its
expression is greatly increased by safener application, while TaGST1-2 and TaGST1-3
are only observed following treatment with the safener. All four of these GST
isoenzymes have broad-ranging activities toward xenobiotic substrates and all four
demonstrate activity towards herbicides and additional activities as glutathione
peroxidases able to reduce organic hydroperoxides, with TaGST1-4 being the most
active in this respect. Each isoenzyme also has specific properties. Thus, for example,
detoxification of one particular herbicide, fenoxaprop-ethyl, is associated with the more
strongly safener-inducible TaGST1-2, TaGST1-3 and TaGST1-4 heterodimers, rather
than with the TaGST1-1 homodimer.

Furthermore, we have identified, cloned and sequenced cDNAs for the major type III
GSTs in wheat, together with cDNAs encoding a range of type I GSTs, all active in
herbicide metabolism. This is fundamental to understanding the GST detoxification
system in wheat and to exploiting it to generate transgenic herbicide-resistant plants
expressing wheat GSTs. In many previous studies, GST activity could not be linked to
specific genes, precluding this approach.

From the sequences of the cDNAs the amino acid sequences of the GST subunits
themselves have been deduced.

Accordingly, the invention provides: 

a polynucleotide encoding a glutathione transferase (GST) subunit, which
polynucleotide comprises a coding sequence capable of hybridising selectively to the
coding sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17 or to the complement of one
of those sequences.

The invention also provides: 

a polypeptide which is a GST subunit and comprises the amino acid sequence of SEQ
ID No. 2, 4, 6, 8, 10, 12, 14, 16 or 18 or a sequence substantially homologous thereto,
or a fragment of either said sequence.

The invention also provides:

a dimeric protein comprising two GST subunits, wherein at least one subunit is a
polypeptide of the invention.

The invention also provides:

a chimeric gene comprising a polynucleotide of the invention operably linked to
regulatory sequences that allow expression of the coding sequence in a host cell.

The invention also provides:

a vector comprising a polynucleotide of the invention or a chimeric gene of the
invention.

The invention also provides: 

a cell transformed or transfected with a vector of the invention.

The invention also provides:

a cell having, integrated into its genome, a chimeric gene of the invention.

The invention also provides:

a process for the production of a polypeptide of the invention, which process
comprises:

(a) cultivating a cell of the invention under conditions that allow the expression of the
polypeptide; and

(b) recovering the expressed polypeptide.

The invention also provides: 

a process for the production of a dimeric protein of the invention, which process
comprises:

(a) cultivating a cell of the invention under conditions that allow:

(i) the expression of the polypeptide of the invention and, if a further polynucleotide
sequence as defined herein is present, optionally the expression of a further GST
subunit encoded by a further polynucleotide, and

(ii) the association of the GST subunit polypeptide of the invention with another GST
subunit polypeptide to form a dimeric protein of the invention; and

(b) recovering the dimeric protein so formed.

The invention also provides:

a method of obtaining a transgenic plant cell comprising:

(a) transforming a plant cell with an expression vector of the invention to give a
transgenic plant cell,

and optionally,

(a') transforming the cell with one or more further polynucleotide sequences
coding for a GST subunit, operably linked to regulatory elements that allow expression
of the subunit in the cell.

The invention also provides:

a method of obtaining a first-generation transgenic plant comprising:

(b) regenerating a transgenic plant cell transformed with a vector of the invention
to give a transgenic plant.

The invention also provides:

a method of obtaining a transgenic plant seed comprising:

(c) obtaining a transgenic seed from a transgenic plant obtainable by regenerating
a transgenic plant cell transformed with a vector of the invention.

The invention also provides:

a method of obtaining a transgenic progeny plant comprising obtaining a 
second-generation transgenic progeny plant from a first-generation transgenic plant 
obtainable by regenerating a transgenic plant cell transformed with a vector of the
invention, and optionally obtaining transgenic plants of one or more further 
generations from the second-generation progeny plant thus obtained. 

The invention also provides: 

a method of obtaining a transgenic progeny plant comprising obtaining a second- 
generation transgenic progeny plant from a first-generation transgenic plant obtainable 
by regenerating a transgenic plant cell transformed with a vector of the invention 
comprising: 

(c) obtaining a transgenic seed from a first-generation transgenic plant obtainable 
by regenerating a transgenic plant cell transformed with a vector of the invention, then 
obtaining a second-generation transgenic progeny plant from the transgenic seed; 

and/or 

(d) propagating clonally a first-generation transgenic plant obtainable by
regenerating a transgenic plant cell transformed with a vector of the invention to give a
second-generation progeny plant;

and/or 

(e) crossing a first-generation transgenic plant obtainable by regenerating a
transgenic plant cell transformed with a vector of the invention with another plant to
give a second-generation progeny plant;

and optionally;

(f) obtaining transgenic progeny plants of one or more further generations from
the second-generation progeny plant thus obtained.

The invention also provides: 

a transgenic plant cell, first-generation plant, plant seed or progeny plant obtainable by
a method of the invention.

The invention also provides: 

a transgenic plant or plant seed comprising plant cells of the invention. 

The invention also provides: 

a transgenic plant cell callus comprising plant cells of the invention, or obtainable from
a transgenic plant cell, first-generation plant, plant seed or progeny plant of the
invention.

The invention also provides: 

use of a polynucleotide of the invention as a selectable marker for detecting
transformation of a plant cell.

The invention also provides: 

a nucleic acid construct comprising: 

(a) a polynucleotide of the invention operably linked to regulatory elements that
allow expression of the coding sequence in a plant cell; and

(b) a site into which a further polynucleotide comprising a coding sequence can be 
inserted. 

The invention also provides: 

a vector comprising such a construct. 

The invention also provides:

a method of transforming a plant cell or of obtaining a plant cell culture or transgenic 
plant comprising: 

(a) providing an untransformed plant cell which is susceptible to a herbicide
whose herbicidal activity is reduced by a dimeric protein of the invention;

(b) transforming the plant cell with a vector comprising: 

(i) a polynucleotide of the invention operably linked to regulatory elements that allow
expression of the coding sequence in a plant cell; and

(ii) a site into which a further polynucleotide comprising a coding sequence can be
inserted;

(c) cultivating the transformed cell under conditions that allow the expression of 
the polynucleotide (a) in the construct; and/or 

(c') regenerating the cell to give a cell culture or plant such that the polynucleotide
(a) in the construct is expressed; and

(d) contacting the cell, cell culture or plant with the herbicide whose herbicidal
activity is reduced by the dimeric protein of the invention, and to which the 
untransformed plant cell was susceptible; and 

(e) selecting cells, cell cultures or plants that are less susceptible to the herbicide
than are corresponding untransformed cells, cell cultures or plants. 

The invention also provides: 

use of a dimeric protein of the invention in a method of identifying compounds capable
of metabolism by a GST. 

The invention also provides: 

a method of identifying compounds capable of being metabolised by a glutathione
transferase comprising: 

(a) contacting a candidate compound suspected of being capable of being
metabolised by glutathione transferase with glutathione (GSH) in the presence of a
dimeric protein of the invention; and

(b) determining whether or not metabolism of the candidate compound takes 
place. 

The invention also provides:

compounds identified by such methods.

The invention also provides:

a kit for detecting compounds capable of being metabolised by a GST
comprising:

(a) reduced glutathione, hydroxymethylglutathione or 
homoglutathione; 

and 

(b) a dimeric protein of the invention. 

The invention also provides:

an antibody which specifically recognises a polypeptide or dimeric protein of the 
invention. 

The invention also provides:

a nucleic acid probe which selectively hybridises to the sequence of SEQ ID No. 1, 3,
5, 7, 9, 11, 13, 15 or 17.

The invention also provides:

a method of identifying compounds that induce GST expression in graminaceous 
plants comprising: 

(a) contacting a graminaceous plant, or a cell or cell culture thereof, with a
candidate compound suspected of being capable of inducing GST expression; and 

(b) determining the level of GST expression in the plant, cell or cell culture. 

The invention also provides:

compounds identified by such methods.

The invention also provides: 

a kit for identifying compounds that induce GST expression in plants by such a
method, which kit comprises an antibody of the invention. 

The invention also provides: 

a method of determining the GST level in a sample of seed or flour comprising:

(i) determining the level of GST protein present by using an antibody 
of the invention; or 

(ii) determining the level of GST mRNA present using a probe of the
invention.

The invention also provides: 

a method of controlling the growth of weeds at a locus where a transgenic plant of the
invention is being cultivated, which method comprises applying to the locus a 
herbicide whose herbicidal properties are reduced by a dimeric protein of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. Anion-exchange chromatography of affinity-purified wheat GSTs. 

Chromatography of A: affinity-purified polar GSTs; and B: affinity-purified
hydrophobic GSTs on Hi-Trap Q-Sepharose columns eluted with the increasing NaCl
gradient shown. The eluent was monitored for A280 as shown with the unbroken line
and individual fractions assayed for GST activity.

Figure 2. HPLC analysis of wheat GST subunits.

Reversed-phase HPLC analysis of polypeptide subunits present in A, affinity-purified
polar GSTs; B, affinity-purified hydrophobic GSTs; C, the isoenzyme TaGST1-1,
resolved by anion-exchange chromatography of the affinity-purified polar GSTs.

DETAILED DESCRIPTION OF THE INVENTION 

Polynucleotides

The invention provides polynucleotides comprising sequences encoding novel GST
subunits, SEQ ID Nos. 1, 3, 5, 7, 9, 11, 13, 15 and 17, and sequences that hybridise
selectively to these coding sequences or their complementary sequences. It also
provides polynucleotide fragments of these sequences that encode polypeptides having
GST activity, as defined herein.

A polynucleotide of the invention is capable of hybridising selectively with the coding
sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17 or to the sequence
complementary to one of those coding sequences. Polynucleotides of the invention
include variants of the coding sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17
which can function as GSTs, when dimerised with another GST subunit. Typically, a
polynucleotide of the invention is a contiguous sequence of nucleotides which is
capable of selectively hybridising to the coding sequence of SEQ ID No. 1, 3, 5, 7, 9,
11, 13, 15 or 17 or to the complement of that coding sequence.

A polynucleotide of the invention can hybridise to the coding sequence of SEQ ID No. 1,
3, 5, 7, 9, 11, 13, 15 or 17 at a level significantly above background. Background
hybridisation may occur, for example, because of other cDNAs present in a cDNA
library. The signal level generated by the interaction between a polynucleotide of the
invention and the coding sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17 is
typically at least 10 fold, preferably at least 100 fold, as intense as interactions between
other polynucleotides and the coding sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15
or 17. The intensity of interaction may be measured, for example, by radiolabelling the
probe, e.g. with 32P. Selective hybridisation is typically achieved using conditions of
medium to high stringency (for example 0.03M sodium chloride and 0.03M sodium
citrate at from about 50°C to about 60°C).

A nucleotide sequence capable of selectively hybridising to the DNA coding sequence
of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17 or to the sequence complementary to one
of those coding sequences will be generally at least 70%, preferably at least 80 or 90%
and more preferably at least 95%, 98% or 99%, homologous to the coding sequence of
SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17 or the complement of one of those sequences
over a region of at least 20, preferably at least 30, for instance at least 40, 60 or 100 or
more contiguous nucleotides.

Any combination of the above mentioned degrees of homology and minimum sizes
may be used to define polynucleotides of the invention, with the more stringent
combinations (i.e. higher homology over longer lengths) being preferred. Thus for
example a polynucleotide which is at least 90% homologous over 25, preferably over
30 nucleotides forms one aspect of the invention, as does a polynucleotide which is at
least 95% homologous over 40 nucleotides.
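
As a rough illustration of how a degree of homology over a minimum contiguous length might be checked, the following Python sketch slides one sequence along another and reports the best ungapped match over a window of a chosen size. It rests on assumptions not stated in the text: "homologous" is treated here as simple nucleotide identity, gaps are ignored, and the function names `best_window_identity` and `meets_threshold` are hypothetical.

```python
def best_window_identity(query: str, reference: str, window: int) -> float:
    """Highest percent identity over any ungapped window of `window`
    contiguous nucleotides shared between `query` and `reference`."""
    query, reference = query.upper(), reference.upper()
    if window <= 0 or len(query) < window or len(reference) < window:
        return 0.0
    best = 0
    # Compare every window of the query with every window of the reference
    for qi in range(len(query) - window + 1):
        q_win = query[qi:qi + window]
        for ri in range(len(reference) - window + 1):
            r_win = reference[ri:ri + window]
            matches = sum(a == b for a, b in zip(q_win, r_win))
            best = max(best, matches)
    return 100.0 * best / window


def meets_threshold(query: str, reference: str,
                    min_identity: float, window: int) -> bool:
    """True if some window of `window` nucleotides reaches `min_identity` percent."""
    return best_window_identity(query, reference, window) >= min_identity


if __name__ == "__main__":
    # Toy sequences for illustration only, not taken from any SEQ ID
    print(best_window_identity("AAAAAAAAAA", "AAAAAAAAAC", 10))
```

With this sketch, the combination "at least 90% over 25 nucleotides" would correspond to `meets_threshold(query, reference, 90.0, 25)`. In practice an alignment program that handles gaps would normally be used; the brute-force comparison here is only meant to make the two parameters (degree of homology and minimum length) concrete.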

Polynucleotides of the invention may comprise DNA or RNA. They may also be
polynucleotides which include within them synthetic or modified nucleotides. A
number of different types of modification to polynucleotides are known in the art.
These include methylphosphonate and phosphorothioate backbones, and addition of
acridine or polylysine chains at the 3' and/or 5' ends of the molecule. For the purposes
of the present invention, it is to be understood that the polynucleotides described herein
may be modified by any method available in the art. Such modifications may be
carried out in order to enhance the in vivo activity or lifespan of polynucleotides of the
invention.

Polynucleotides of the invention may be used to produce a primer, e.g. a PCR primer, a
primer for an alternative amplification reaction, a probe e.g. labelled with a revealing
label by conventional means using radioactive or non-radioactive labels, or the
polynucleotides may be cloned into vectors. Such primers, probes and other fragments
will be at least 10, preferably at least 15 or 20, for example at least 25, 30 or
40 nucleotides in length.

Polynucleotides such as a DNA polynucleotide and primers according to the invention
may be produced recombinantly, synthetically, or by any means available to those of
skill in the art. They may also be cloned by standard techniques. The polynucleotides
are typically provided in isolated and/or purified form.

In general, primers will be produced by synthetic means, involving a stepwise
manufacture of the desired nucleic acid sequence one nucleotide at a time. Automated
techniques for accomplishing this are readily available in the art.

Genomic clones corresponding to the cDNAs of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15
and 17 containing, for example, introns and promoter regions are also aspects of the
invention and may also be produced using recombinant means, for example using PCR
(polymerase chain reaction) cloning techniques, starting with genomic DNA from a
wheat (Triticum aestivum L.) cell, e.g. a wheat shoot cell, or a cell of a plant of a
related Triticum species, for example as described by Feldman et al. (Scientific
American, (1981), vol. 244(1), pages 98 to 109).

Although in general the techniques mentioned herein are well known in the art,
reference may be made in particular to Sambrook et al., 1989, Molecular Cloning: a
laboratory manual.

Polynucleotides which are not 100% homologous to the sequences of the present
invention but fall within the scope of the invention can be obtained in a number of
ways.

Other allelic variants of the wheat sequences of SEQ ID Nos. 1, 3, 5, 7, 9, 11, 13, 15
and 17, including those from Triticum aestivum L. itself or from species related to
Triticum aestivum L. (cf. Feldman et al., supra), may be obtained for example by probing
genomic DNA libraries made from a range of wheat cells, using probes as described
above.

In addition, other plant homologues of SEQ ID Nos. 1, 3, 5, 7, 9, 11, 13, 15 and 17
may be obtained and such homologues and fragments thereof in general will be
capable of selectively hybridising to the coding sequence of SEQ ID No. 1, 3, 5, 7, 9,
11, 13, 15 or 17 or its complement. Such sequences may be obtained by probing cDNA
or genomic libraries from other plant species with probes as described above.
Degenerate probes can be prepared by means known in the art to take into account the
possibility of degenerate variation between the DNA sequences of SEQ ID Nos. 1, 3, 5,
7, 9, 11, 13, 15 and 17 and the sequences being probed for under conditions of medium
to high stringency (for example 0.03M sodium chloride and 0.03M sodium citrate at
from about 50°C to about 60°C).

Allelic variants and species homologues may also be obtained using degenerate PCR
which will use primers designed to target sequences within the variants and
homologues encoding likely conserved amino acid sequences. Likely conserved
sequences can be predicted from aligning the amino acid sequences of the invention
(SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 and 18) with that of other similar GST subunit
sequences. The primers will contain one or more degenerate positions and will be used
at stringency conditions lower than those used for cloning sequences with single
sequence primers against known sequences.
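
To make the idea of primer degeneracy concrete, the following Python sketch counts how many distinct nucleotide sequences could encode a short conserved peptide under the standard genetic code. The function name `primer_degeneracy` and the example peptides are hypothetical and are not taken from the sequences of the invention; the counts per residue are the usual numbers of synonymous codons in the standard code.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
CODONS_PER_RESIDUE = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def primer_degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences for `peptide`, i.e. the size of the
    primer pool needed to cover every possible codon combination."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Hypothetical peptides, chosen only to contrast low and high degeneracy
    for peptide in ("MKWF", "LSR"):
        print(peptide, primer_degeneracy(peptide))
```

A conserved stretch rich in methionine and tryptophan gives a small primer pool, whereas one rich in leucine, serine or arginine gives a large one, which is one reason conserved regions with few highly degenerate residues are generally favoured when designing such primers.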

Alternatively, such polynucleotides may be obtained by site directed mutagenesis of
SEQ ID Nos. 1, 3, 5, 7, 9, 11, 13, 15 or 17 sequences or allelic variants thereof. This
may be useful where, for example, silent codon changes are required to sequences to
optimise codon preferences for a particular host cell in which the polynucleotide
sequences are being expressed. Other sequence changes may be desired in order to
introduce restriction enzyme recognition sites, or to alter the property or function of the
polypeptides encoded by the polynucleotides.
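
A silent codon change is one that alters the nucleotide sequence without altering the encoded amino acid sequence. The Python sketch below shows one way such a change might be verified, assuming the standard genetic code; the helper names `translate` and `is_silent` and the short example sequences are hypothetical and serve only as an illustration.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if `mutated` encodes the same polypeptide as `original`."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


if __name__ == "__main__":
    # GCT -> GCC keeps alanine (silent); GCT -> GAT changes it to aspartate
    print(is_silent("ATGGCT", "ATGGCC"), is_silent("ATGGCT", "ATGGAT"))
```

Choosing which synonymous codon to substitute would additionally require a codon usage table for the intended host cell, which is not given here.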

The invention further provides double stranded polynucleotides comprising a 
polynucleotide of the invention and its complement. 

Polynucleotides, probes or primers of the invention may carry a revealing label.
Suitable labels include radioisotopes such as 32P or 35S, enzyme labels, or other protein
labels such as biotin. Such labels may be added to polynucleotides, probes or primers
of the invention and may be detected using techniques known per se.

The present invention also provides polynucleotides encoding the polypeptides of the
invention described below. Because such polynucleotides will be useful as sequences
for recombinant production of polypeptides of the invention, it is not necessary for
them to be selectively hybridisable to the coding sequence of SEQ ID Nos. 1,
3, 5, 7, 9, 11, 13, 15 or 17, although this will generally be desirable. Otherwise, such
polynucleotides may be labelled, used, and made as described above if desired.
Polypeptides of the invention are described below.

Particularly preferred polynucleotides of the invention are those of SEQ ID No. 1, 3, 5,
7, 9, 11, 13, 15 or 17 and the polynucleotides that are the coding regions within those
sequences, i.e. the regions which encode the polypeptides of SEQ ID No. 2, 4, 6, 8, 10,
12, 14, 16 or 18.

Polypeptides

A polypeptide of the invention consists essentially of the amino acid sequence set out
in SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 or 18 or a substantially homologous sequence,
or of a fragment of either of these sequences. In general, the naturally occurring amino
acid sequences shown in SEQ ID Nos. 2, 4, 6, 8, 10, 12, 14, 16 or 18 are preferred.
However, the polypeptides of the invention include homologues of the natural
sequences, and fragments of the natural sequences and of their homologues, which
have GST activity.

The polypeptides of the invention are glutathione transferase (GST) subunits. The 
invention also provides dimeric proteins comprising two GST subunits wherein at least 
one subunit is a polypeptide of the invention.

Thus, the polypeptides of the invention are normally functionally active as GSTs when
dimerised with another GST subunit. Thus, dimeric proteins of the invention are
capable of catalysing the conjugation of the tripeptide glutathione (GSH,
gamma-glutamylcysteinyl glycine) and/or related derivatives to an electrophilic
substrate of natural or synthetic origin. Related derivatives include homoglutathione
(gamma-glutamylcysteinyl alanine) and hydroxymethylglutathione
(gamma-glutamylcysteinyl serine).

Optionally, they may also have one or more of the other properties of naturally 
occurring GSTs including glutathione peroxidase activity as defined above. 

Preferably, they have GST activity towards one or more herbicide substrates. For
example, they may have activity towards one or more of the following herbicides:
Fluorodifen, Fenoxaprop-ethyl, Metolachlor, Alpha-Metolachlor, Acetochlor,
Alachlor, Pretilachlor, Fluthiamid, Dimethenamid, S-Dimethenamid,
Flupyrsulfuron-methyl, Triflusulfuron-methyl, Acifluorfen, Chlorimuron-ethyl, Fomesafen,
Atrazine, Simazine, Cyanazine and the sulphatide metabolite of Metribuzin. Particularly
preferred herbicides include Fenoxaprop-ethyl, Flupyrsulfuron-methyl, Fluthiamid,
Acetochlor, Metolachlor and Alpha-Metolachlor.

Most preferably, a dimeric protein of the invention is able to catalyse the conjugation
of GSH to one or more of the following herbicide substrates: Fenoxaprop-ethyl,
Flupyrsulfuron-methyl, Fluthiamid, Acetochlor, Metolachlor and Alpha-Metolachlor.

Optionally, a dimeric protein of the invention may be able to catalyse the conjugation
of GSH to one or more non-herbicide substrates, for example CDNB. It may also
have activity towards phytotoxic non-herbicide substrates.

Optionally, monomeric polypeptides of the invention may have GST activity as
defined above, even when not dimerised.

In particular, a polypeptide of the invention may comprise:

(a) the polypeptide sequence of SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 or 18;

(b) an allelic variant or species homologue thereof; or

(c) a protein at least 70, 80, 90, 95, 98 or 99% homologous to (a) or (b).

An allelic variant will be a variant which will occur naturally in a plant and which will
function in a substantially similar manner to the protein of SEQ ID No. 2, 4, 6, 8, 10,
12, 14, 16 or 18, as defined above. Similarly, a species homologue of the protein will
be the equivalent protein which occurs naturally in another plant species and which can
function as a GST. Such a homologue may occur in plants other than wheat, particularly
monocotyledonous plants such as related Triticum species, rice, maize, oats, rye,
barley, triticale or sorghum. Within any one species, a homologue may exist as several
allelic variants, and these will all be considered homologues of the protein of SEQ ID
No. 2, 4, 6, 8, 10, 12, 14, 16 or 18.

Allelic variants and species homologues can be obtained by following the procedures
described herein for the production of the polypeptides of SEQ ID No. 2, 4, 6, 8, 10,
12, 14, 16 and 18 and performing such procedures on a suitable cell source, e.g. a cell
of a wheat genotype carrying an allelic variant, or a cell of a plant of another
species. It will also be possible to use a probe as defined above to
probe libraries made from plant cells in order to obtain clones encoding the allelic or
species variants. The clones can be manipulated by conventional techniques to generate
a polypeptide of the invention which can then be produced by recombinant or synthetic
techniques known per se.

A polypeptide of the invention is preferably at least 70% homologous to the protein of
SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 or 18, more preferably at least 80 or 90% and
more preferably still at least 95%, 97% or 99% homologous thereto over a region of at
least 20, preferably at least 30, for instance at least 40, 60 or 100 or more contiguous
amino acids. Methods of measuring protein homology are well known in the art and it
will be understood by those of skill in the art that in the present context, homology is
calculated on the basis of amino acid identity (sometimes referred to as "hard
homology").
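By way of illustration only, identity-based homology over a region of contiguous amino acids might be computed along the following lines. This is a minimal sketch, not a method prescribed by the specification: it assumes the two sequences have already been aligned without gaps to the same length, and the function names and the short example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions carrying the same residue in two
    equal-length, ungapped, pre-aligned sequences (an assumption)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_window_identity(a: str, b: str, window: int) -> float:
    """Highest identity over any stretch of `window` contiguous positions."""
    if len(a) != len(b) or not 1 <= window <= len(a):
        raise ValueError("window must fit within two equal-length sequences")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


# Hypothetical 8-residue sequences differing at one position.
reference = "MAPVKVFG"
candidate = "MAPLKVFG"
overall = percent_identity(reference, candidate)
best_of_four = best_window_identity(reference, candidate, 4)
```

In practice the window would be set to 20, 30, 40, 60 or 100 residues as recited above, and the alignment itself would be produced by a conventional alignment tool.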

The sequence of the polypeptides of SEQ ID Nos 2, 4, 6, 8, 10, 12, 14, 16 and 18 and
of allelic variants and species homologues can thus be modified to provide
polypeptides of the invention.

Amino acid substitutions may be made, for example from 1, 2 or 3 to 10, 20 or 30
substitutions. The modified polypeptide generally retains activity as a GST, as defined
herein. Conservative substitutions may be made, for example according to the
following Table. Amino acids in the same block in the second column and preferably
in the same line in the third column may be substituted for each other.



ALIPHATIC    Non-polar          G A P
                                I L V
             Polar-uncharged    C S T M
                                N Q
             Polar-charged      D E
AROMATIC                        H F W Y

Polypeptides of the invention also include fragments of the above-mentioned full
length polypeptides and variants thereof, including fragments of the sequence set out in
SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 and 18. Such fragments typically retain activity as
a GST.
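The conservative-substitution Table above lends itself to a simple lookup. The following is a hedged sketch only: it encodes just the rows that appear in the Table as reproduced here, treats the aromatic row as a block of its own (an assumption, since the Table gives it no second-column entry), and uses hypothetical names throughout.

```python
# Third-column lines of the Table, grouped by second-column block.
# Only the rows shown in the Table above are encoded.
BLOCKS = {
    "non-polar": ["GAP", "ILV"],
    "polar-uncharged": ["CSTM", "NQ"],
    "polar-charged": ["DE"],
    "aromatic": ["HFWY"],  # assumption: treated as its own block
}


def classify_substitution(old: str, new: str) -> str:
    """Return 'same line', 'same block' or 'non-conservative' for a
    single-residue substitution, using one-letter amino acid codes."""
    for lines in BLOCKS.values():
        members = "".join(lines)
        if old in members and new in members:
            if any(old in line and new in line for line in lines):
                return "same line"
            return "same block"
    return "non-conservative"
```

Under this sketch, a substitution classified as 'same line' corresponds to the preferred case in the Table, and 'same block' to the permitted but less preferred case.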

Other preferred fragments include those which include an epitope. Suitable fragments
will be at least about 5, e.g. 10, 12, 15 or 20 amino acids in size. Polypeptide fragments
of the polypeptides of SEQ ID Nos. 2, 4, 6, 8, 10, 12, 14, 16 and 18, and allelic and
species variants thereof, may contain one or more (e.g. 2, 3, 5, or 10) substitutions,
deletions or insertions, including conserved substitutions. Epitopes may be determined
by techniques such as peptide scanning techniques already known in the art.
These fragments will be useful for obtaining antibodies to polypeptides and dimeric
proteins of the invention.

Polypeptides of the invention may be in a substantially isolated form. It will be
understood that the polypeptide may be mixed with carriers or diluents which will not
interfere with the intended purpose of the polypeptide and still be regarded as
substantially isolated. A polypeptide of the invention may also be in a substantially
purified form, in which case it will generally comprise the polypeptide in a preparation
in which more than 90%, e.g. 95%, 98% or 99% of the polypeptide in the preparation
is a polypeptide of the invention.

Polypeptides of the invention may be modified, for example by the addition of
histidine residues or a T7 tag to assist their identification or purification or by the
addition of a signal sequence to promote their secretion from a cell.

A polypeptide of the invention may be labelled with a revealing label. The revealing
label may be any suitable label which allows the polypeptide to be detected. Suitable
labels include radioisotopes, e.g. 125I, 35S, enzymes, antibodies, polynucleotides and
linkers such as biotin.

Polypeptides and dimeric proteins of the invention may be chemically modified, e.g.
post-translationally modified. For example, they may be glycosylated or comprise
modified amino acid residues. Such modified polypeptides and proteins fall within the
scope of the terms "polypeptide" and "dimeric protein" of the invention.

Dimeric proteins 

The invention also provides dimeric proteins having two GST subunits wherein at least
one of the two subunits is a polypeptide of the invention. These dimeric proteins may
have two identical subunits of the invention, i.e. they may be homodimeric.
Alternatively, they may have two dissimilar subunits, i.e. they may be heterodimeric.

In heterodimers, the two subunits may both be polypeptides of the invention.
Alternatively, one subunit may be a polypeptide of the invention, whilst the other is a
different GST subunit.

Thus, for example, heterodimeric proteins of the invention may have one subunit
which is a polypeptide of the invention, and one which is a known GST subunit from
maize (e.g. ZmGSTI, ZmGSTII, ZmGSTIII, ZmGSTIV, ZmGSTV or ZmGSTVI: see
above), or another species.

Preferably, the dimeric proteins have two subunits that are polypeptides of the
invention. Various combinations of polypeptides of the invention are possible.
Preferred combinations include:

TaGST1-1 (SEQ ID No. 2/SEQ ID No. 2);
TaGST1-2 (SEQ ID No. 2/SEQ ID No. 16);
TaGST1-3 (SEQ ID No. 2/SEQ ID No. 18);

these being representative of the major combinations found in GSTs in safener-treated
wheat.

The invention also provides dimeric proteins having two subunits as described above
which are fusion proteins. In these fusion proteins, the two subunits are joined by a
linker polypeptide. Any linker may be used as long as it does not interfere significantly
with the correct association of the two subunits or with the GST activity of the dimer.
Such fusion proteins will typically be prepared by joining together the polynucleotides
encoding the two monomers in the correct reading frame, then expressing the
composite polynucleotide coding sequence under the control of regulatory sequences
as defined herein. These composite polynucleotide coding sequences are a further
aspect of the invention, as are chimeric genes and vectors comprising them, methods of
producing them by recombinant means, and cells and plants comprising such vectors or
chimeric genes. It will be understood that dimeric proteins of the invention may be
such fusion proteins.
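Joining two subunit coding sequences "in the correct reading frame" can be illustrated as follows. This is a minimal sketch under stated assumptions: the coding sequences are given as DNA strings beginning at the start codon, the function name `fuse_in_frame` is hypothetical, and the example sequences are invented placeholders rather than any sequence of the invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(first_cds: str, linker: str, second_cds: str) -> str:
    """Join two coding sequences through a linker so that one continuous
    open reading frame results."""
    first, link, second = (s.upper() for s in (first_cds, linker, second_cds))
    for name, seq in (("first_cds", first), ("linker", link), ("second_cds", second)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of three")
    # Remove the stop codon of the first subunit so translation reads through.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    fused = first + link + second
    # Every codon except the last must be a sense codon.
    internal = (fused[i:i + 3] for i in range(0, len(fused) - 3, 3))
    if any(codon in STOP_CODONS for codon in internal):
        raise ValueError("fusion contains a premature in-frame stop codon")
    return fused


# Invented placeholder sequences: two tiny ORFs and a Gly-Gly linker.
composite = fuse_in_frame("ATGGCTTAA", "GGTGGT", "ATGAAATGA")
```

The length checks guard against frame shifts at either junction; a real linker would be chosen so as not to interfere with subunit association, as stated above.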

Vectors and chimeric genes 

Polynucleotides of the invention can be incorporated into a recombinant replicable
vector. The vector may be used to replicate the nucleic acid in a compatible host cell.

Thus, in a further embodiment, the invention provides a method of making
polynucleotides of the invention by introducing a polynucleotide of the invention into a
replicable vector, introducing the vector into a compatible host cell, and cultivating the
host cell under conditions which bring about replication of the vector. The vector may
be recovered from the host cell. Suitable host cells are described below in connection
with expression vectors. Bacterial cells, especially E. coli, are preferred.

Expression vectors 

Preferably, a polynucleotide of the invention in a vector is operably linked to
regulatory sequences capable of effecting the expression of the coding sequence by the
host cell, i.e. the vector is an expression vector. Such expression vectors can be used to
express the polypeptides of the invention.

The term "operably linked" refers to a juxtaposition wherein the components described
are in a relationship permitting them to function in their intended manner. A regulatory
sequence "operably linked" to a coding sequence is positioned in such a way that
expression of the coding sequence is achieved under conditions compatible with the
regulatory sequences.

Such vectors may be introduced into a suitable host cell to provide for expression of a
polypeptide or polypeptide fragment of the invention, as described below.

The vectors may be, for example, plasmid, virus or phage vectors provided with an
origin of replication, preferably a promoter for the expression of the said
polynucleotide and optionally an enhancer and/or a regulator of the promoter. For
expression in plant cells, one preferred enhancer is the Tobacco etch virus (TEV)
enhancer. A terminator sequence may also be present, as may a polyadenylation
sequence. The vectors may contain one or more selectable marker genes, for example
an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin
resistance gene (e.g. nptI or nptII) or methotrexate resistance gene for a plant vector.

Vectors may be used in vitro, for example for the production of RNA, or used to
transfect or transform a host cell. The vector may also be adapted to be used in vivo, for
example for generation of transgenic plants of the invention.

So far as plasmid vectors are concerned, plasmids derived from the Ti plasmid of
Agrobacterium tumefaciens are especially preferred, as are plasmids derived from the
Ri plasmid of Agrobacterium rhizogenes.

A further embodiment of the invention provides host cells transformed or transfected
with the vectors for the replication and expression of polynucleotides of the invention.
The cells will be chosen to be compatible with the said vector and may for example be
prokaryotic (bacterial), plant, yeast, insect or mammalian cells, bacterial and plant cells
being preferred.

Polynucleotides according to the invention may also be inserted into the vectors
described above in an antisense orientation in order to provide for the production of
antisense RNA. Antisense RNA or other antisense polynucleotides may also be
produced by synthetic means. Such antisense polynucleotides may be used in a method
of controlling the levels of GSTs having the sequence of SEQ ID No. 2, 4, 6, 8, 10, 12,
14, 16 or 18 or their variants or species homologues in planta.
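Inserting a coding sequence in the antisense orientation means that the strand read from the promoter is the reverse complement of the sense strand. A minimal sketch of that transformation is given below; the function name and the six-base example are hypothetical and are not taken from the sequences of the invention.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, i.e. the sequence
    presented to the promoter when an insert is placed in antisense
    orientation."""
    return dna.translate(_COMPLEMENT)[::-1]


sense = "ATGGCC"            # hypothetical fragment
antisense = reverse_complement(sense)
```

Applying the function twice returns the original sequence, which provides a quick sanity check when preparing constructs in silico.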

Promoters and other regulatory elements may be selected to be compatible with the
host cell for which the expression vector is designed.

Promoters suitable for use in plant cells may be derived, for example, from plants or
from bacteria that associate with plants or from plant viruses. Thus, promoters from
Agrobacterium spp. including the nopaline synthase (nos), octopine synthase (ocs) and
mannopine synthase (mas) promoters are preferred. Also preferred are plant promoters
such as the ribulose bisphosphate small subunit promoter (rubisco ssu), and the
phaseolin promoter. Also preferred are plant viral promoters such as the cauliflower
mosaic virus (CaMV) 35S and 19S promoters.

Depending on the pattern of expression desired, promoters may be constitutive or
inducible. For example, strong constitutive expression in plants can be obtained with
the CaMV 35S or rubisco ssu promoters. Also, tissue-specific or stage-specific
promoters may be used to target expression of polypeptides of the invention to
particular tissues in a transgenic plant or to particular stages in its development.
Chemically inducible promoters such as those activated by herbicide safeners may also
be used, for example the maize GST 27 promoter (WO97/11189), the maize In2-1
promoter (WO90/11361), or the maize In2-2 promoter (De Veylder et al. (1997), Plant
Cell Physiology, Vol. 38, pages 568 to 577).

Especially where expression in plant cells is desired, other regulatory signals may also

Expression in the host cell may be transient although, preferably, integration of the
polynucleotide or chimeric gene of the invention into the cell's genome is achieved.

Suitable cells include cells in which the above-mentioned vectors may be expressed.
These include microbial cells such as bacteria such as E. coli, plant cells, mammalian
cells such as CHO cells, COS7 cells or HeLa cells, insect cells or yeast such as
Saccharomyces. Bacterial and plant cells are preferred.

Optionally, cells of the invention may comprise one or more further polynucleotide
sequences encoding a GST subunit, operably linked to regulatory sequences, as defined
above, that allow expression of the subunit in the cell. Such polynucleotide sequences
may be further polynucleotides of the invention or they may encode other GST
subunits as defined above with respect to dimeric proteins.

Such polynucleotides may be naturally present in the cell, e.g. if it is a plant cell, or
they may be introduced artificially, e.g. as defined above.

Such cells allow the production of heterodimeric proteins of the invention where the 
polynucleotides encode different GST subunits, or the production of monomeric
polypeptides of the invention and/or homodimeric proteins of the invention in greater 
quantities. For example, they may allow the expression of active heterodimeric 
enzymes. 

Cell culture will take place under standard conditions. Culture media for cell culture
are commercially available and can be used in accordance with
manufacturers' instructions.

Processes for production of polypeptides and dimeric proteins

The invention provides processes for the production of polypeptides and dimeric
proteins of the invention by recombinant means.

Generally, monomeric GST subunits of the invention spontaneously dimerise to form 
homodimers and/or heterodimers of the invention. Thus, in general, expression of 
polypeptides of the invention gives rise to dimers in the first instance. These dimers
may be the desired product; alternatively, it may be desirable to separate the 
monomers. For example, as described below, it may be desired to separate the 
monomeric subunits of a homodimer in order to combine them with different 
monomeric subunits, thereby yielding heterodimers. 

Processes for the production of polypeptides of the invention may comprise:

(a) cultivating a transformed cell as defined above under conditions that allow the
expression of the polypeptide;

and preferably

(b) recovering the expressed polypeptide.

For example, the expressed monomeric peptides may be recovered by denaturation of
dimers formed by them, which separates the subunits. Then, the monomers can be 
recovered and renatured. Typically, they will then redimerise. 

Processes for production of dimeric proteins of the invention may comprise:

(a) cultivating a transformed cell as defined above under conditions that allow

(i) the expression of the polypeptide of the invention and, if a further GST
subunit-encoding sequence as defined above is present, optionally the expression of a further
GST subunit encoded by the further sequence;

(ii) the association of the GST subunit polypeptide of the invention with another
identical GST subunit polypeptide to form a homodimeric protein of the invention; and/or

(iii) the association of the GST subunit polypeptide of the invention with a
non-identical GST subunit to form a heterodimeric protein of the invention;

and preferably

(b) recovering the dimeric proteins so formed, and optionally resolving them.

Where only a single type of GST subunit-encoding sequence of the invention is present
in the transformed cell, these processes normally give rise to homodimeric proteins of
the invention. Where one or more further GST subunit-encoding sequences is present,
these processes give rise to heterodimers or to a mixture of some or all of the
following: homodimers of each possible type.
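The set of dimers that could in principle arise when several subunit types are co-expressed can be enumerated as unordered pairs. The sketch below is illustrative only: the subunit labels and the `TaGST` prefix in the output follow the naming used earlier in this section, the function name is hypothetical, and the enumeration says nothing about which dimers actually form or in what proportion.

```python
from itertools import combinations_with_replacement


def possible_dimers(subunits, prefix="TaGST"):
    """List every homodimer and heterodimer that could be assembled from
    the given subunit labels, named prefix + 'x-y'."""
    unique = sorted(set(subunits))
    return [f"{prefix}{x}-{y}" for x, y in combinations_with_replacement(unique, 2)]


# Three co-expressed subunit types give 3 homodimers and 3 heterodimers.
dimers = possible_dimers(["1", "2", "3"])
```

For n distinct subunit types the list has n(n+1)/2 entries, of which n are homodimers.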

Alternatively, dimeric proteins of the invention can be produced by expressing the
required polypeptide subunits in separate cells. This typically leads to the production of
two different types of homodimer. The desired heterodimer can then be prepared by:
mixing the homodimers and denaturing the mixed sample, or by denaturing the
homodimers separately and then mixing them; then renaturing the mixed sample. This
will typically lead to a mixture of dimeric proteins comprising both possible types of
homodimers and also heterodimers comprising one subunit of each type. Similarly,
mixtures of greater numbers of types of dimer can be produced in this way if different
homodimers are produced in three or more different cells, or if cells that give rise to
heterodimers are used.

For these processes, any transformed cell as described above may be used. Bacterial
cells are preferred, especially cells of E. coli, although other cell types may also be
used.

Optionally, the polypeptide or dimeric protein may be isolated and/or purified, by
techniques known in the art. 

In processes of the invention, any suitable method may be used to denature and/or
renature polypeptides of the invention, and suitable methods are well known in the art.

Similarly, where a mixture of polypeptide subunits or dimeric proteins results, these
may be resolved or separated by any suitable technique known in the art.

Antibodies 

The invention also provides monoclonal or polyclonal antibodies which specifically 
recognise polypeptides of the invention or dimeric proteins of the invention. 

Thus, antibodies of the invention bind specifically to the polypeptides and/or dimers of 
the invention, preferably to the extent that they distinguish between the polypeptides
and/or dimers of the invention and other GST subunits and GSTs. 

Monoclonal antibodies may be prepared by conventional hybridoma technology using
polypeptides or dimeric proteins of the invention as immunogens. Polyclonal
antibodies may also be prepared by conventional means which comprise inoculating a
host animal, for example a rat or a rabbit, with a polypeptide of the invention and
recovering immune serum. In order that such antibodies may be made, polypeptides
may be haptenised to another polypeptide for use as immunogens in animals or
humans. For the purposes of this invention, the term "antibody" includes antibody
fragments such as Fv, F(ab) and F(ab)2 fragments, as well as single chain antibodies.

Methods of producing transgenic plant cells, plant parts and tissues, plants and seeds of
the invention

Transgenic plant cells, plant parts and tissues, plants and seeds of the invention are 
transgenic in the sense that they have at least one polynucleotide of the invention
introduced into them. 

The invention provides a method of obtaining a transgenic plant cell comprising
transforming a plant cell with an expression vector of the invention to give a transgenic
plant cell; and optionally transforming the cell with one or more further polynucleotide
sequences coding for a GST subunit, operably linked to regulatory elements that allow
expression of the subunit in the cell. (As discussed above, this allows the production of
heterodimeric GSTs of the invention, or the production of homodimeric ones of
the invention in greater quantities.)

Any suitable transformation method may be used, for example the transformation
techniques described herein. Preferred transformation techniques include
electroporation of plant protoplasts, transformation by Agrobacterium tumefaciens and
particle bombardment. Particle bombardment is particularly preferred for
transformation of monocot cells.

The cell may be in any form, for example, it may be an isolated cell, e.g. a protoplast,
or it may be part of a plant tissue, e.g. a callus, or a tissue excised from a plant, or it
may be part of a whole plant. Transformation may thus give rise to a chimeric tissue or
plant in which some cells are transgenic and some are not.

Preferably, integration of a polynucleotide or chimeric gene of the invention into the 
cell's genome is achieved. 

The thus obtained cell may be regenerated into a transgenic plant by techniques known
in the art. These may involve the use of plant growth substances such as auxins,
gibberellins and/or cytokinins to stimulate the growth and/or division of the transgenic
cell. Similarly, techniques such as somatic embryogenesis and meristem culture may
be used.

In many such techniques, one step is the formation of a callus, i.e. a plant tissue
comprising expanding and/or dividing cells. Such calli are a further aspect of the
invention as are other types of plant cell cultures and plant parts. Thus, for example,
the invention provides transgenic plant tissues and parts, including embryos,
meristems, seeds, shoots, roots, stems, leaves and flower parts. These may be chimeric
in the sense that some of their cells are transgenic and some are not.

Regeneration procedures will typically involve the selection of transformed cells by 
means of marker genes. Some marker genes have already been mentioned and it should 
also be noted that the polynucleotides of the invention can themselves act as marker 
genes if they are under the control of regulatory sequences that allow their expression
during the appropriate stage of the regeneration procedure. The polypeptides of the 
invention are capable of conferring resistance to herbicides or other phytotoxic 
compounds which are detoxified by GSTs on cells of the invention, as described 
below. Thus, an appropriate herbicide can be used to select transformants. 

The regeneration step gives rise to a first generation transgenic plant. The invention
also provides methods of obtaining transgenic plants of further generations from this first
generation plant. These are known as progeny transgenic plants. Progeny plants of
second, third, fourth, fifth, sixth and further generations may be obtained from the first
generation transgenic plant by any means known in the art.

Thus, the invention provides a method of obtaining a transgenic progeny plant
comprising obtaining a second-generation transgenic progeny plant from a
first-generation transgenic plant of the invention, and optionally obtaining transgenic plants
of one or more further generations from the second-generation progeny plant thus
obtained.

Such progeny plants are desirable because the first generation plant may not have all
the characteristics required for cultivation. For example, for the production of first
generation transgenic plants, a plant of a taxon that is easy to transform and regenerate
may be chosen. It may therefore be necessary to introduce further characteristics in one
or more subsequent generations of progeny plants before a transgenic plant more
suitable for cultivation is produced.

Progeny plants may be produced from their predecessors of earlier generations by any
known technique. In particular, progeny plants may be produced by:

obtaining a transgenic seed from a transgenic plant of the invention belonging to a
previous generation, then obtaining a transgenic progeny plant of the invention
belonging to a new generation by growing up the transgenic seed;

and/or

propagating clonally a transgenic plant of the invention belonging to a previous
generation to give a transgenic progeny plant of the invention belonging to a new
generation;

and/or

crossing a transgenic plant of the invention belonging to a previous
generation with another compatible plant to give a transgenic progeny plant of the
invention belonging to a new generation;

and optionally

obtaining transgenic progeny plants of one or more further generations from the
progeny plant thus obtained.

These techniques may be used in any combination, for example, clonal propagation
and sexual propagation may be used at different points in a process that gives rise to a
transgenic plant suitable for cultivation. In particular, repetitive back-crossing with a
plant taxon with agronomically desirable characteristics may be undertaken. Further
steps of removing cells from a plant and regenerating new plants therefrom may also
be carried out.
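As a rough guide to how many rounds of repetitive back-crossing might be needed, the textbook expectation for the share of the recurrent (agronomically desirable) parent's genome can be computed as below. This is an assumption brought in for illustration and is not stated in the specification: it treats loci as unlinked, ignores selection for the transgene and for background markers, and the function name is hypothetical.

```python
def expected_recurrent_fraction(backcrosses: int) -> float:
    """Expected fraction of the recurrent parent's genome after the given
    number of back-crosses, counting the initial cross as zero.
    Assumes unlinked loci and no selection (a simplification)."""
    if backcrosses < 0:
        raise ValueError("number of back-crosses cannot be negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


# Expected fractions for the initial cross and the first four back-crosses.
fractions = [expected_recurrent_fraction(n) for n in range(5)]
```

Under these simplifying assumptions the expected fraction rises from one half in the initial hybrid towards one with each further back-cross; linkage around the transgene will in practice slow recovery of the recurrent genome in that region.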

Also, further desirable characteristics may be introduced by transforming the cells,
plant tissues, plants or seeds, at any suitable stage in the above process, to introduce
desirable coding sequences other than the polynucleotides of the invention. This may be
carried out by the techniques described herein for the introduction of polynucleotides
of the invention.

For example, further transgenes may be selected from those coding for other herbicide
resistance traits, e.g. tolerance to: Glyphosate (e.g. using an EPSP synthase gene (e.g.
EP-A-0 293,358) or a glyphosate oxidoreductase (WO 92/000377) gene);
fosametin; a dihalobenzonitrile; glufosinate (e.g. using a phosphinothricin acetyl
transferase or glutamine synthase gene (cf. EP-A-0 242,236)); asulam (e.g. using a
dihydropteroate synthase gene (EP-A-0 369,367)); a sulphonylurea (e.g. using an
ALS gene); diphenyl ethers such as acifluorfen or oxyfluorfen (e.g. using a
protoporphyrinogen oxidase gene); an oxadiazole such as oxadiazon; a cyclic imide such
as chlorophthalim; a phenyl pyrazole such as TNP, or a phenopylate or carbamate
analogue thereof.

Similarly, genes for beneficial properties other than herbicide tolerance may be
introduced. For example, genes for insect resistance may be introduced, notably genes
encoding Bacillus thuringiensis (Bt) toxins.

Transgenic plant cells, plant parts and tissues, plants and seeds of the invention

The invention also provides transgenic plant cells, plant parts and tissues, plants and
seeds. These are typically obtainable, or obtained, by the methods described above.
They may be of any botanical taxon, e.g. any species or lower taxonomic grouping.
Preferably, they are of a crop plant species.

Transgenic plant cells, plant parts and tissues, plants and seeds of the invention may 
thus be of a monocotyledonous (monocot) or dicotyledonous (dicot) taxon. Preferred 
dicot crop plants include tomato; potato; sugarbeet; cruciferous crops, including
oilseed rape; linseed; tobacco; sunflower; fibre crops such as cotton; and leguminous 
crops such as peas, beans, especially soybean, and alfalfa. Preferred monocots include 
graminaceous plants such as wheat, maize, rice, oats, barley and rye, sorghum, triticale 
and sugar cane. Wheat is particularly preferred. 

Typically, a polypeptide of the invention is expressed in a plant of the invention.
Depending on the promoter used, this expression may be constitutive or inducible, e.g.
by a herbicide safener. Similarly, it may be tissue- or stage-specific, i.e. directed
towards a particular plant tissue or stage in plant development.

Preferably, plant cells, plant parts and tissues, plants and seeds of the invention exhibit 
herbicide resistance due, at least in part, to expression of a polypeptide of the 
invention. 

Herbicides to which plants of the invention may be resistant include Fluorodifen,
Fenoxaprop-ethyl, Metolachlor, Alpha-Metolachlor, Acetochlor, Alachlor, Pretilachlor,
Fluthiamid, Dimethenamid, S-Dimethenamid, Flupyrsulfuron-methyl,
Triflusulfuron-methyl, Acifluorfen, Chlorimuron-ethyl, Fomesafen, Atrazine, Simazine, Cyanazine,
and Metribuzin. Particularly preferred herbicides include Fenoxaprop-ethyl,
Flupyrsulfuron-methyl, Fluthiamid, Acetochlor, Metolachlor and Alpha-Metolachlor.
Plants of the invention may also exhibit resistance to other herbicides capable of
conjugation to GSH by GSTs or to other non-herbicide phytotoxic substances.

Preferably, a transgenic plant of the invention exhibits resistance to one or more of
Fenoxaprop-ethyl, Flupyrsulfuron-methyl, Fluthiamid, Acetochlor, Metolachlor and
Alpha-Metolachlor. Resistance may be exhibited to herbicides which are selective for
particular plant taxa and/or herbicides which are generic to all plants.

Uses of the polynucleotides, polypeptides, antibodies, probes and plants of the
invention

Apart from enabling the generation of herbicide-resistant plants, the invention has a
number of other uses.

Selectable markers 

Polynucleotides of the invention can be used as selectable markers for detecting the
transformation of plant cells. When expressed from polynucleotides of the invention,
the polypeptides of the invention are capable of conferring herbicide resistance on cells
of the invention, as described herein. Thus, an appropriate herbicide can be used to
select transformants.

Accordingly, the invention provides a nucleic acid construct comprising: 

(a) a polynucleotide of the invention operably linked to regulatory elements that
allow expression of a polynucleotide of the invention in a plant cell; and

(b) a site into which a further polynucleotide comprising a coding sequence can be
inserted.

Preferably, site (b) is bounded by regulatory elements that allow expression of a coding
sequence inserted at the site in a plant cell.



WO 99/14337    PCT/GB98/02802



These constructs may be contained within vectors as described herein. 

In these constructs, site (b) is a site into which another nucleic acid sequence can be
inserted. In cells transformed with the constructs or vectors containing them,
expression of the polypeptide of the invention can be used as a selectable marker,
indicating that the polynucleotide at site (b) has also been successfully introduced.

In this connection, the invention also provides a method of transforming a plant cell or
of obtaining a plant cell culture or transgenic plant comprising:

(a) providing an untransformed plant cell which is susceptible to a herbicide whose
herbicidal activity is reduced by a dimeric protein of the invention;

(b) transforming the plant cell with a vector comprising a marker construct of the
invention;

(c) cultivating the transformed cell under conditions that allow the expression of a
polypeptide of the invention;

and/or

(c') regenerating the cell to give a cell culture or plant such that a polypeptide of the
invention is expressed;

and

(d) contacting the cell, cell culture or plant with the herbicide whose herbicidal activity
is reduced by a dimeric protein of the invention, and to which herbicide the
untransformed plant cell was susceptible;

and

(e) selecting cells, cell cultures or plants that are less susceptible to the herbicide than
are corresponding untransformed cells, cell cultures or plants.


Identification of novel herbicides 

The polypeptides and dimeric proteins of the invention may be used to identify
compounds capable of conjugation to GSH. Thus, as conjugation to GSH is the
mechanism by which GSTs are believed to effect detoxification of herbicides, the
polypeptides of the invention can be used to determine whether or not a candidate
herbicidal compound will be detoxified by GSTs, for example the dimeric proteins of
the invention. In this case, it may be possible to develop the candidate compound as a
herbicide. In particular, it may be possible to develop the candidate compound for
selective use as a herbicide on crops of wheat, or of a wheat-related species, or of other
plants (cf. Feldman et al., supra), having GSTs with similar activity to the dimeric
proteins of the invention. This is because species having such GSTs can be expected to
detoxify herbicides identified in the assay.

Accordingly, the invention provides a method of identifying compounds capable of
conjugation to glutathione comprising:

(a) contacting a candidate compound suspected of being capable of being
metabolised by glutathione transferase with glutathione (GSH) in the presence of a
dimeric protein of the invention; and

(b) determining whether or not metabolism of the candidate compound takes
place, or to what extent it takes place.

Preferably, metabolism of the compound is detected by determining whether, or to
what extent, conjugation of the candidate compound to GSH takes place.

Such assay methods may be carried out by any suitable means known in the art.
Compounds may be assayed singly, or, preferably, in batches containing several
compounds. For example, microtitre plate-based assay techniques may be used. More
specifically, the techniques of Example 4 below may be used.

The invention also provides compounds identified by the methods of the invention. 

The invention also provides a kit for detecting compounds capable of being
metabolised by a GST comprising:

(a) reduced glutathione, hydroxymethylglutathione or homoglutathione; and 

(b) a dimeric protein of the invention. 


Such kits may also comprise other components, especially buffer solutions, e.g.
aqueous solutions buffered at a suitable pH (e.g. pH 7 to pH 10, preferably pH 7 to pH 8).

These kits can be used in the identification of novel herbicides. 


Identification of compounds that induce GST expression 

We have found that expression of the GSTs of the invention is induced by herbicide
safeners. As GSTs are implicated in herbicide resistance, it may be desirable to identify
other compounds capable of inducing their expression or that of related GSTs in wheat
or other plants, preferably graminaceous plants. Such compounds may, for example, be
used to induce expression of GSTs involved in herbicide tolerance. This will be
beneficial as it will allow crop plants to be selectively protected from herbicides whilst
weeds are killed by them.


Accordingly, the invention provides a method of identifying compounds that induce
GST expression in graminaceous plants comprising:

(a) contacting a plant, preferably a graminaceous plant, or a cell or cell culture thereof,
with a candidate compound suspected of being capable of inducing GST expression;

and

(b) determining the level of GST expression in the plant, cell or cell culture.

Typically, the level of expression is also determined before the compound is added, or
in an untreated sample, in order to provide a control. If the level of GST expression in
the test sample is higher than that in the control sample then the candidate compound is
an inducer of GST expression.
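
The comparison with a control described above can be sketched in Python as follows.
This is an illustration only: the function names, the use of replicate means and the
reporting of a fold-induction value are assumptions made for the sketch and are not
part of the method, which requires only that expression in the test sample exceed that
in the control.

```python
from statistics import mean


def fold_induction(treated_levels, control_levels):
    """Ratio of mean GST expression in treated samples to that in controls.

    Hypothetical helper: returns infinity when the control level is zero but
    expression is detected after treatment, and 1.0 when both are zero.
    """
    treated = mean(treated_levels)
    control = mean(control_levels)
    if control == 0:
        return float("inf") if treated > 0 else 1.0
    return treated / control


def is_inducer(treated_levels, control_levels):
    """Score a candidate as an inducer when treated expression exceeds control."""
    return mean(treated_levels) > mean(control_levels)
```

In practice a statistical test on the replicates would normally replace the bare
comparison of means; the sketch mirrors only the criterion stated in the text.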

Preferably, the level of GST expression is determined quantitatively although, in
certain situations, qualitative detection may suffice, e.g. where the level of expression
is zero or undetectable in the absence of an inducer.

Determination of the level of GST expression may be performed by any suitable
means. Preferably, it is performed using antibodies or probes of the invention, as
described herein.

The invention also provides compounds identified by these methods. 

Antibodies that specifically recognise the polypeptides or dimeric proteins of the
invention can be used to detect and preferably quantify GST expression by detecting
them directly. The antibodies of the invention may thus be used for detecting
polypeptides or dimeric proteins of the invention present in plant samples, e.g. by a
method which comprises:

(a) providing an antibody of the invention;

(b) incubating a plant sample with said antibody under conditions which allow for
the formation of an antibody-antigen complex; and

(c) determining, by any suitable technique known in the art, whether
antibody-antigen complex comprising said antibody is formed.

Antibodies of the invention may be bound to a solid support and/or packaged into kits
in a suitable container along with suitable reagents, controls, instructions and the like.

Similarly, polynucleotides or primers of the invention or fragments thereof, labelled or
unlabelled, may be used by a person skilled in the art in nucleic acid-based tests for
detecting nucleic acid sequences of the invention in a sample taken from a plant,
typically a wheat plant.

Such tests generally comprise bringing a sample containing DNA or RNA into contact
with a probe comprising a polynucleotide or primer of the invention under hybridising
conditions and detecting any duplex formed between the probe and nucleic acid in the
sample. Such detection may be achieved using techniques such as PCR or by
immobilising the probe on a solid support, removing nucleic acid in the sample which
is not hybridised to the probe, and then detecting nucleic acid which has hybridised to
the probe.

Alternatively, the sample nucleic acid may be immobilised on a solid support, and the
amount of probe bound to such a support can be detected.

The probes of the invention may conveniently be packaged in the form of a test kit in a
suitable container. In such kits the probe may be bound to a solid support where the
assay format for which the kit is designed requires such binding. The kit may also
contain suitable reagents for treating the sample to be probed, hybridising the probe to
nucleic acid in the sample, control reagents, instructions, and the like.

Owing to the secondary activity of the GSTs of the invention as glutathione
peroxidases, the polypeptides and dimeric proteins of the invention will also have
applications in determining the quality of batches of seed and flour, especially of wheat
seed, grain and wheat flour. In such batches, glutathione peroxidases are involved in
reducing lipid hydroperoxides, which reduces the amount of GSH available. In
particular, this occurs during bread making. Thus, it is desirable to be able to monitor
the level of GSTs having glutathione peroxidase activity in batches of seed and flour.


This can be done by any suitable means. For example, antibodies of the invention can 
be used to detect polypeptides or dimeric proteins of the invention, as described above. 
Similarly, probes of the invention can be used to detect GST mRNA, as described 
above. 


Alternatively, to determine directly the level of GSH in a batch, the invention provides
a method of determining the GSH level in a batch of seed or flour comprising:

(a) contacting a sample from the batch with a polypeptide or dimeric protein of the
invention and a compound whose conjugation to GSH is catalysed by the polypeptide
or protein; and

(b) determining the GSH level from the extent of reaction between the compound
and GSH.


Controlling the growth of weeds

The invention also provides a method of controlling the growth of weeds at a locus
where a transgenic plant of the invention is being cultivated, which method comprises
applying a herbicide to the locus. Any amount of herbicide may be used, as long as it is
herbicidally effective against the weeds but leaves the herbicide-resistant plants of the
invention unaffected, or substantially unaffected. The effect on the weeds may be, for
example, to kill them or to inhibit their growth.

Any type of weed that responds to a particular herbicide may be controlled in this way.
Examples include Alopecurus myosuroides, Avena fatua, Lolium spp., Bromus spp.,
Poa annua, Galium aparine, Apera spica-venti, Matricaria inodora, Stellaria media,
Papaver rhoeas, Polygonum spp., Setaria spp., Sorghum halepense, Panicum miliaceum,
Echinochloa spp., Digitaria sanguinalis, Phalaris minor, Abutilon theophrasti,
Amaranthus retroflexus, Chenopodium album, Datura stramonium, Solanum nigrum,
Xanthium strumarium, Sagittaria spp., Monochoria vaginalis, Lindernia spp.,
Eleocharis kuroguwai, Scirpus juncoides and Cyperus spp.

The herbicide may, for example, be one whose activity is identified by the methods of
the invention (see above). Alternatively, it may be a known herbicide, for example one
of the herbicides mentioned herein.

The herbicide may be applied at any suitable time during the life cycle of the
transgenic plant, for example pre-emergence or post-emergence. Timing of application
will be tailored to the development of the weeds which it is desired to control. Where
inducible or tissue- and/or stage-specific expression of the active dimer of the
invention is employed, timing of herbicide application will be tailored to the optimal
expression of the active dimer in the course of the development of the transgenic plant
of the invention.

The following Examples illustrate the invention.

EXAMPLES

Example 1: Isolation and characterisation of the nucleotide sequence 1 encoding
TaGST1

(a) Purification of wheat GST isoenzymes

Wheat GST isoenzymes containing the TaGST1 subunit were purified by the method
of Dixon et al. (Pestic. Sci. 1997, 50, 72-82). This is summarised below.

Wheat seeds (Triticum aestivum L. var. Hunter) were imbibed in a 10 mg/l solution of
the herbicide safener fenchlorazole-ethyl and then grown in an environmental growth
room with further root-applied watering treatments of 5 mg/l fenchlorazole-ethyl
applied as required. At 10 days after imbibing, the shoot tissue was harvested and
extracted prior to precipitation of the protein with ammonium sulphate (80%
saturation). The total protein extract was then applied in the presence of 1 M
ammonium sulphate to a phenyl-Sepharose column. The bound GSTs were then
recovered, firstly by reducing the ammonium sulphate concentration to 0 M to yield
the polar GST fraction, which represented 61% of the recovered activity toward
1-chloro-2,4-dinitrobenzene (CDNB). The remaining 39% of the GST activity was then
recovered by adding ethylene glycol (50% v/v) to the running buffer to yield the
hydrophobic GST fraction.

The polar and hydrophobic GST fractions were then independently applied to the
affinity matrix, S-hexyl-glutathione agarose. This matrix bound 90% of the GST
activity toward CDNB. Prior to elution of the column with the ligand,
S-hexyl-glutathione, the matrix was washed with phosphate buffer, followed by phosphate
buffer containing 200 mM potassium chloride. The GSTs eluting in this salt wash were
termed the "loosely-bound" fraction. Tightly-bound proteins were then eluted with 5
mM S-hexyl-glutathione. With both the polar and hydrophobic GSTs an average of
34% of the GST activity toward CDNB eluted in the loosely-bound fraction and 66%
eluted in the presence of S-hexyl-glutathione. The loosely-bound fraction contained the
GSTs which will be considered in Example 2. The major wheat GSTs of interest in this
example were found in the affinity-purified pool and, to define the numbers of
isoenzymes and component subunits present, this pool was analysed in detail.

When the affinity-bound pools of the polar and hydrophobic GSTs were analysed by
anion-exchange chromatography on Q-Sepharose, the partial resolution of the eluting
activity suggested the presence of multiple isoenzymes (Figure 1). The component
polypeptides in the active fractions were then analysed by silver staining after
resolution by sodium dodecyl sulphate polyacrylamide gel electrophoresis
(SDS-PAGE). It was concluded that in fenchlorazole-ethyl-treated wheat the polar GSTs
were composed of 25 kDa and 26 kDa polypeptides, while the hydrophobic fraction
contained 25 kDa, 26 kDa and 24 kDa polypeptides. Further analysis by reversed-phase
HPLC confirmed the subunit compositions (Figure 2). Based on the combined analyses
by Q-Sepharose, HPLC and SDS-PAGE these GST polypeptides were named as
described in Table 1, which also contains details of how these subunits associate
together to form the active dimers found in plants and the relative abundance of these
subunits in extracts from unsafened and fenchlorazole-ethyl treated (safened) plants.

(b) GST activities of the purified TaGST isoenzymes

The purified isoenzymes were assayed for GST activity toward herbicides using the
HPLC-based assays described by Edwards R. and Cole D.J. (Pesticide Biochemistry
and Physiology Vol. 54, pp 96-104 (1996)) and the results are presented in Table 2.
Both polar and hydrophobic GSTs from the affinity-bound pools of isoenzymes
showed detoxifying activities toward the selective graminicide fenoxaprop-ethyl, the
diphenyl-ether herbicide fluorodifen, and the chloroacetanilide metolachlor. These
isoenzymes had additional activities as glutathione peroxidases able to reduce linoleic
acid hydroperoxide, a major reaction product formed during membrane peroxidation in
plants (Williamson and Beverly, J. Cereal Sci. 8, 1988, 155-163).


(c) Preparation of polyclonal antibodies to the major wheat GST isoenzymes

Purified TaGST1-1 was used to immunise rabbits to raise polyclonal antibodies to the
differing isoenzymes. The reactivity of the antiserum toward crude wheat preparations
was demonstrated with a Western blot of polypeptides resolved by SDS-PAGE. The
antibodies were then used to identify the corresponding cDNAs in an expression
library.

(d) Identification and characterisation of a cDNA encoding TaGST1

An expression library was prepared from poly(A)+ RNA extracted from 7-day wheat
shoots grown from seed treated with fenchlorazole-ethyl. The library was constructed
in lambda ZAP II (Stratagene) and plaque forming units (pfus) screened with the
antiserum raised against TaGST1-1. From an initial screen of 170,000 pfus, 17 positive
plaques were identified, of which 12 were further purified to homogeneity in secondary
and tertiary screens and the wheat cDNAs excised from the phage to form Bluescript
plasmids in E. coli SOLR (Stratagene). Automated DNA sequencing showed that all
clones had an identical coding sequence, although differences in the 5' and 3'
untranslated regions were apparent, such that of 6 clones sequenced fully on both
strands, 4 different untranslated regions were observed. Since these clones shared a
common open reading frame they were all designated TaGST1 and then subdivided as
A, B, C and D. The nucleotide sequence of TaGST1 showing the variable untranslated
regions of A, B, C and D is shown in SEQ ID No. 1, together with the deduced amino
acid sequence of the coding region (SEQ ID No. 2).

To confirm that TaGST1 encoded a GST, it was expressed as a fusion protein with
beta-galactosidase using the pBluescript plasmid in E. coli strain SOLR. TaGST1
clones were inoculated into LB liquid medium and were grown overnight at 37°C on an
orbital shaker in the presence of IPTG. Bacteria were then pelleted by centrifugation,
lysed by sonication and assayed for GST activity toward CDNB and analysed by
SDS-PAGE and Western blotting using the anti-TaGST1-1 serum. With all six TaGST1
clones, GST activity toward CDNB could be determined in the crude extracts in the
range 30 - 50 nkat/mg crude lysate. This was in contrast to control E. coli containing
the Bluescript plasmid without a cDNA insert, which showed negligible GST activity
(0.2 nkat/mg). When the polypeptides contained in the lysates of the various TaGST1
clones were analysed by SDS-PAGE, in every case the TaGST1-fusion protein was
clearly visible as a highly expressed polypeptide relative to the controls. All the fusion
proteins reacted with the anti-TaGST1 serum.

To confirm that the GST activity in the extracts from TaGST1 clones was due to the
fusion protein, the GST-fusion was purified using S-hexyl-glutathione agarose affinity
chromatography. The pure fusion protein was then assayed for enzyme activity toward
herbicide and hydroperoxide substrates and was found to show a similar spectrum of
activities to that of the pure TaGST1-1 isoenzyme from wheat shoots.

Table 1

Summary of the characteristics of major classes of wheat GST isoenzymes.

The GST subunits had the following retention times by reversed-phase HPLC:
TaGST1a - 26.4 min, TaGST1b - 27.1 min, TaGST1 - 31.1 min, TaGST3 - 30.9 min,
TaGST4 - 33.2 min.

Isoenzyme  Subunits  Polar (P) or     Molecular     Anti-TaGST1  % Enhancement
type                 hydrophobic (H)  weight (kDa)  antibody     by safener
                                                    reaction

TaGST1-1   TaGST1a   P                25            +            30-50
           TaGST1b   P                25            +            30-50

TaGST1-2   TaGST1a   P                25            +            Only observed
           TaGST1b   P                25            +            with safener
           TaGST2    P                26

TaGST1-3   TaGST1a   P                25            +            Only observed
           TaGST1b   P                25            +            with safener
           TaGST3    H                26

TaGST1-4   TaGST1a   P                25            +            300%
           TaGST1b   P                25            +            300%
           TaGST4    H                24                         300%

Table 2

Activity of GST isoenzymes purified from fenchlorazole-ethyl-treated wheat
shoots.

Enzyme activities are expressed as nkat.mg⁻¹

Isoenzyme     CDNB     Fluorodifen   Fenoxaprop-ethyl   Metolachlor

Polar
TaGST1-1      1,528    0.97          0                  0.11
TaGST1-2      1,441    0.38          0.61               0.25

Hydrophobic
TaGST1-3      1,700    0.38          0.44               0.28
TaGST1-4      1,553    0.57          0.23               0.23

Table 3

CDNB and herbicide activities of recombinant wheat GSTs
Activities expressed as nkat.mg⁻¹ ± standard error

Recombinant   CDNB                  Fluorodifen      Fenoxaprop-ethyl   Metolachlor
enzyme

TaGST1        1970                  2.0 ± 0.1        0                  0.127 ± 0.014
WIC 1         406.5 ± [illegible]   0.136 ± 0.011    0.050 ± 0.010      0.315 ± 0.003
WIC 2         187 ± 1               0.096 ± 0.002    0.085 ± 0.002      0.512 ± 0.04
WIC 3         2,519 ± 88            0.014 ± 0.006    0.093 ± 0.002      0.053 ± 0.004
WIC 4         980 ± 86              0.036 ± 0.004    0.012 ± 0.001      0.037 ± 0.003
WIC 5         174 ± 8               0.030 ± 0.002    0.067 ± 0.003      0.040 ± 0.004
TA 27         237 ± 13              0.034 ± 0.003    0.036 ± 0.004      0.063 ± 0.006
ICR           8139 ± 146            0.037 ± 0.002    0.028 ± 0.001      0.000 ± 0.000
ICC/V/P       30 ± 4                0.000 ± 0.000    0.074 ± 0.008      0.000 ± 0.000

Example 2: Cloning of wheat GSTs resembling the type I GSTs from maize.

(a) Characterisation of type I GSTs in wheat

The observation that extracts from safener-treated wheat shoots contained GSTs which,
unlike those described in Example 1, were not selectively retained on the affinity
matrix suggested that a discrete class of GSTs were present in this loosely bound
fraction. Crude extracts of wheat seedlings were analysed by Western blotting
following SDS-PAGE using a polyclonal rabbit antiserum raised to the type I
ZmGSTI-II heterodimer. The antiserum reacted strongly with several polypeptides of
Mr 23 - 27 kDa. These polypeptides were present in the loosely-bound fraction from
the S-hexyl-glutathione affinity column, but not in the affinity-bound fraction.

(b) Cloning of cDNAs from a wheat expression library

Having established that safener-treated wheat shoots contained polypeptides which
cross-reacted with the antiserum raised to the maize GSTs, the primary cDNA
expression library prepared from fenchlorazole-ethyl treated wheat shoots was
screened with the antibody for positive clones. Following a screen of 170,000 pfu, ten
positive plaques were identified, with obvious differences in the intensity of
recognition, with four plaques showing a strong colour reaction and six plaques of
lower intensity. These cDNA clones were termed WIC clones. All four of the
stronger-reacting plaques (WIC 1, 2, 4 and 5) and four of the weaker positives (WIC 3,
7, 8 and 10) were purified to homogeneity, the respective plasmids excised and DNA
preparations sequenced. The clones were then grouped by their degree of similarity in
sequence.

In the sequence listing, WIC 1 is SEQ ID No. 3 and its deduced amino acid sequence is
SEQ ID No. 4. WIC 2 is SEQ ID No. 5 and its deduced amino acid sequence is SEQ
ID No. 6. The coding sequences of WIC 3, WIC 7 and WIC 8 were identical in
sequence. The DNA sequence of WIC 3/7/8 is given in SEQ ID No. 7 and the deduced
amino acid sequence in SEQ ID No. 8. All three sequences contained a stop codon in
the 5' untranslated region of the GST gene, although some expression occurred. The
DNA sequence of WIC 5 is shown in SEQ ID No. 9, and the deduced amino acid
sequence in SEQ ID No. 10. WIC 4 and WIC 10 had identical coding sequences, but
differed in their untranslated regions. In particular, WIC 10 had a stop codon in the 5'
untranslated region, though this did not prevent all expression of the fusion protein. The
WIC 4 DNA sequence is given in SEQ ID No. 11 and the deduced WIC 4/10 amino
acid sequence in SEQ ID No. 12 (the WIC 10 DNA sequence is not shown).

(c) Cloning of wheat GSTs by differential screening of a cDNA library

A further cDNA clone, termed TA 27, was obtained. A cDNA library prepared from
wheat seedlings treated with the herbicide safener cloquintocet-mexyl was screened
for clones which represented mRNAs which were differentially expressed in wheat in
response to safener application. The identity of the clone as a GST was suggested from
its nucleotide (SEQ ID No. 13) and deduced amino acid (SEQ ID No. 14) sequence. As
the coding sequence of TA 27 was not in frame with beta-galactosidase in the
pBluescript vector, the coding sequence was sub-cloned into the expression vector pET
11a (Novagen), with translation starting at the first ATG codon in the clone, which
gave a reasonable alignment of the open reading frame with that of other GSTs
involved in herbicide metabolism, notably the ZmGSTIV sequence.


(d) Activity of recombinant GSTs of the invention

To confirm that the WIC clones and TA 27 encoded functional GSTs, the corresponding
enzymes were expressed as recombinant enzymes in E. coli. The full coding sequence
of TA 27 was expressed in the pET vector, while the WIC clones were expressed as
fusions with part of the beta-galactosidase enzyme using the pBluescript vector. The
levels of recombinant protein expressed varied between the differing clones.
Appreciable amounts of recombinant protein were observed in the TA 27 pET clones
and in clones WIC 1, WIC 2, WIC 4 and WIC 5. Western blotting of these total
bacterial extracts with the antiserum raised to ZmGSTI-II showed that the fusion
proteins were selectively recognised by the antiserum. In contrast, use of the antiserum
demonstrated much lower levels of expression of immunoreactive fusion proteins in
clones WIC 3, WIC 7, WIC 8 and WIC 10.

To assay the recombinant fusion proteins for GST activity, the E. coli were grown in
the presence of IPTG and then pelleted by centrifugation. The bacteria were then lysed
by sonication and the protein precipitated using 80% ammonium sulphate. After
resuspension and desalting, GSTs were purified by affinity chromatography. The WIC
3 fusion protein was purified using sulphobromophthalein-S-glutathione affinity
chromatography (Mozer et al., Biochem. 22, 1983, 1068-1072) while the other WIC
fusion proteins were purified using glutathione-agarose (Mannervik and Guthenberg,
Methods Enzymol. 77, 1981, 231-235). The purified enzymes were then assayed for
GST activities toward herbicides (Table 3) and GST activities toward non-herbicide
substrates and glutathione peroxidase activities toward organic hydroperoxides (Table
4).

Table 4

Other GST activities and glutathione peroxidase activities.

Activities expressed as nkat.mg⁻¹ ± standard error. Peroxidase activities expressed as
absorbance change at 340 nm.mg⁻¹ ± standard error (n=3). N.D. = not detected,
- = not performed.

Enzyme     Cumene          Benzyl           Crotonaldehyde   Ethacrynic
           hydroperoxide   isothiocyanate                    acid

WIC 1      18.6 ± 0.5      18.0 ± 3.75      7.1 ± 0.7        N.D.
WIC 2      28.2 ± 1.7      33.3 ± 4.5       5.5 ± 1.3        N.D.
WIC 3      1.4 ± 0.3       9.0 ± 0.5        6.3 ± 0.9        N.D.
WIC 4      6.2 ± 0.3       4.2 ± 0.4        5.5 ± 0.6        1.4 ± 0.3
WIC 5      1.3 ± 0.2       9.4 ± 2.0        4.5 ± 0.5        N.D.
TaGST1     0.7 ± 0.1       11.8 ± 0         3.7 ± 0.3        N.D.
TA 27      3.6 ± 0.4       -                -                -
ICR        N.D.            -                -                -
ICC/V/P    0.84 ± 0.04     -                -                -

Example 3: Cloning of safener-inducible type III GSTs from wheat.

A polyclonal antiserum was raised in a rabbit to a mixture of TaGST1-2 and
TaGST1-3. When tested against crude extracts from safener-treated wheat shoots the
antiserum recognised both the 25 kDa TaGST1 subunit and the 26 kDa safener-inducible
TaGST2 and TaGST3 subunits. The antiserum was then used in conjunction with the
antiserum raised to TaGST1-1 to immunoscreen the cDNA library prepared from
fenchlorazole-ethyl treated wheat shoots as described in Example 1. Duplicate lifts were
taken from the plated-out library and the first blot screened with the antiserum raised
against TaGST1-2 and TaGST1-3. The second blot was screened with the antiserum
raised to TaGST1-1. Five plaques were identified from the first blot which were absent
from the second blot, corresponding to cDNAs encoding TaGST2- or TaGST3-like
polypeptides, and these clones were purified and the respective plasmids sequenced.
One of the clones, termed ICJ, had an identical nucleotide sequence to GST Tsl, a
safener-inducible GST identified in Triticum tauschii (Riechers et al., 1997, Plant
Physiol. 114, 1568). Another clone, ICR, though showing some similarity to ICJ, had a
novel coding DNA sequence (SEQ ID No. 15) and predicted amino acid
sequence (SEQ ID No. 16). The other three clones, ICC, ICP and ICV, had identical
DNA sequences (SEQ ID No. 17) and predicted amino acid sequence (SEQ ID No.
18). GST ICJ was sub-cloned into the pET 11a vector after using PCR to introduce an
Nde I restriction site at the translation start site, using the primer 5' AGG TAG TTA
CAT ATG GCC GGA GGA 3' (SEQ ID No. 19) in the amplification. Following
sub-cloning, the sequence of GST ICJ was re-checked to ensure no PCR-induced errors
had been introduced. The recombinant GST ICJ was then expressed in E. coli and purified
by S-hexylglutathione affinity chromatography. The purified GST ICJ was assayed for
activities as a GST (Table 3) and as a glutathione peroxidase (Table 4). The clone GST
ICV was expressed in a variety of vectors, but in all cases the recombinant proteins
proved impossible to purify using a variety of affinity columns
(S-hexylglutathione-agarose, S-bromosulphophthalein glutathione agarose). GST ICV was
finally expressed as the respective beta-galactosidase fusion protein using the Bluescript
plasmid and assayed for GST activity (Table 3) and glutathione peroxidase activity
(Table 4) in crude bacterial lysates. The specific activity of GST ICV toward these
substrates was then calculated by (i) subtracting the low levels of GST and GPOX
activity present due to the endogenous activities in the E. coli and (ii) determining the
proportion of protein in the lysate present as recombinant GST ICV from SDS-PAGE
analysis, by densitometric analysis of polypeptides stained with Coomassie blue.
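
The two-step correction described above for GST ICV can be sketched in Python as
follows. This is an illustration under stated assumptions: the function and parameter
names are hypothetical, and dividing the background-corrected activity by the
recombinant protein fraction is one plausible reading of steps (i) and (ii), not a
formula given in the text.

```python
def corrected_specific_activity(lysate_activity, endogenous_activity,
                                recombinant_fraction):
    """Estimate the specific activity of a recombinant enzyme in a crude lysate.

    lysate_activity      -- activity per mg total protein in the expressing lysate
    endogenous_activity  -- activity per mg total protein in a control lysate
    recombinant_fraction -- share of lysate protein that is recombinant enzyme,
                            e.g. from densitometry of a stained gel (0 to 1)
    """
    if not 0 < recombinant_fraction <= 1:
        raise ValueError("recombinant_fraction must be in the range (0, 1]")
    # Step (i): remove the background contributed by the host bacteria.
    net_activity = lysate_activity - endogenous_activity
    # Step (ii): express the remainder per mg of recombinant protein only.
    return net_activity / recombinant_fraction
```

With purely illustrative inputs of 5.0 and 1.0 activity units per mg total protein and
a recombinant fraction of 0.25, the function returns 16.0 units per mg recombinant
protein.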

Example 4: A microtitre plate-based screen to identify herbicidal molecules
which are metabolised by GSTs of the invention and may selectively control
weeds in a crop of wheat or other species such as maize, soybean or rice


(a) Degradation of candidate herbicides by wheat GSTs and relationship to crop and
weed selectivity

Herbicidal molecules which are degraded by recombinant wheat GSTs may be 
15 predicted to be tolerated by plants of wheat or other crop species. These herbicides may 
be less rapidly degraded in weeds such as black-grass {Alopecurus myosuroides) which 
are desirable to control in a crop of wheat or other species. Herbicides found in a 
laboratory based screen to be metabolised by these GSTs are therefore likely to possess 
useful abilities to selectively control troublesome weeds in a crop of wheat, or other 
20 species such as maize, soybean, rice, cotton, barley, oat, rye, sorhum, triticale, potato, 
sugarcane or sugarbeet. 

(b) A 96 well plate-based assay procedure for identifying novel herbicides degraded
by recombinant wheat GSTs

Novel herbicides arising from a chemical synthesis programme oriented to
optimisation for selective herbicidal activity and potency may be screened for ability to
be degraded by a panel of recombinant GSTs using a 96 well microplate assay format
and subsequent reaction analysis by automated High Pressure Liquid Chromatography
(HPLC). This allows, for example, the screening of a set of eleven novel herbicides and
one positive control compound, such as CDNB, against a panel of seven recombinant
GSTs. An eighth file of wells contains test compounds but lacks GSTs; these wells
serve to identify non-enzymic reaction of the test compounds with reduced glutathione.
Alternatively, the array can be configured to screen more test compounds against a
more limited number of GSTs. For example, fifteen compounds can be screened
against five GSTs, or forty-seven compounds may be screened with a single mixture of
GSTs. In all cases, provision is made for a positive control and to test for non-enzymic
reaction with reduced glutathione.
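
The well counts behind the three plate configurations above can be checked with a small calculation. This is a sketch under the assumption (implied but not stated in the text) that every compound, including the positive control, occupies one well per GST plus one well in the GST-free file; the function and variable names are hypothetical.

```python
PLATE_WELLS = 96


def wells_needed(test_compounds, gst_preparations,
                 positive_controls=1, no_enzyme_files=1):
    """Wells used when each compound is run once against each GST
    preparation and once without enzyme."""
    compounds = test_compounds + positive_controls
    files = gst_preparations + no_enzyme_files
    return compounds * files


# The three configurations described in the text.
layouts = {
    "11 compounds, 7 GSTs": wells_needed(11, 7),
    "15 compounds, 5 GSTs": wells_needed(15, 5),
    "47 compounds, 1 GST mixture": wells_needed(47, 1),
}

for name, used in layouts.items():
    print(f"{name}: {used} of {PLATE_WELLS} wells")
```

Under that assumption each configuration fills the plate exactly, which is consistent with the numbers of compounds and GSTs quoted.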

Enzyme assays are carried out in a total reaction volume of 100 microlitres. Each
reaction mixture contains 100 micromolar Tris.HCl buffer, pH 7.8, 500 micromolar
reduced glutathione and, where appropriate, 500 micromolar test compound or a
reference substrate such as CDNB, and 14 micrograms of GST protein. The microplate
is incubated at 30°C on a variable speed agitator for 30 minutes and reactions are then
stopped by the addition of 15 microlitres of 23% perchloric acid solution. The
microplate is then centrifuged at 2000 g for 15 minutes.
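
The quantities implied by these assay conditions can be worked out as follows. This is an illustrative sketch: the helper names are hypothetical, and the dilution calculation assumes simple additive volumes.

```python
def nmol_in_well(concentration_micromolar, volume_microlitres):
    """Amount of solute in a well: micromolar x microlitres gives picomoles,
    so divide by 1000 to obtain nanomoles."""
    return concentration_micromolar * volume_microlitres / 1000.0


def final_percent(stock_percent, added_microlitres, reaction_microlitres):
    """Concentration of the stop reagent after addition to the reaction,
    assuming the volumes are additive."""
    total = added_microlitres + reaction_microlitres
    return stock_percent * added_microlitres / total


gsh_nmol = nmol_in_well(500, 100)          # reduced glutathione per well
substrate_nmol = nmol_in_well(500, 100)    # test compound or CDNB per well
acid_percent = final_percent(23, 15, 100)  # perchloric acid after stopping

print(gsh_nmol, substrate_nmol, acid_percent)
```

On these assumptions each well holds 50 nanomoles each of glutathione and substrate, and the stopped reaction contains about 3% perchloric acid in a final volume of 115 microlitres.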

(c) Reaction analysis by automated High Pressure Liquid Chromatography

The separation and analysis of glutathione conjugates of test herbicides may be carried
out using High Pressure Liquid Chromatography (HPLC), for example a Gilson HPLC
in tandem with corresponding software, for example Gilson Version 7.12, and fitted
with an appropriate column, for example a 5 cm Spherisorb ODS2 column. Typically,
separation may be carried out using a two phase solvent system as follows: Phase A:
water containing 0.1% trifluoroacetic acid and 5.5% acetonitrile; Phase B: 100%
acetonitrile; flow rate 1.5 ml per minute; injection volume 20 microlitres.

The elution gradient may be typically as follows: 10% phase B for one minute,
followed by a linear gradient to reach 60% phase B after 8.5 minutes. The gradient is
further increased to reach 100% phase B at 9 minutes; phase B is continued at 100%
until 11.5 minutes and is then reduced in a linear gradient to 10% at 13.5 minutes. A
further 1.5 minutes at 10% phase B is required to re-equilibrate the column.
Absorbance signals are detected at 264 nanometres using a suitable UV detector. These
conditions detect the glutathione conjugate of CDNB, which has a retention time of
2.4 minutes, and resolve it from unreacted CDNB, which has a retention time of
4.75 minutes. Such conditions also allow for the resolution and detection of the
glutathione conjugates arising from the metabolism of other reference herbicides such
as metolachlor, fenoxaprop, fenoxaprop-ethyl and fluorodifen, and also of a variety of
novel herbicidal compounds identified in the assay.
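
The elution programme described above can be written as a table of time points with linear interpolation between them. This is a minimal sketch: the names are hypothetical, and it gives the programmed solvent composition only, ignoring any delay between the pump and the column.

```python
# (time in minutes, % phase B) break points taken from the gradient description;
# the final point adds the 1.5 minute re-equilibration at 10% phase B.
GRADIENT = [
    (0.0, 10.0),
    (1.0, 10.0),
    (8.5, 60.0),
    (9.0, 100.0),
    (11.5, 100.0),
    (13.5, 10.0),
    (15.0, 10.0),
]


def percent_b(t_min):
    """Programmed % phase B at time t_min, by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0 to 15 minute programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0.5, 4.75, 10.0, 12.5):
    print(t, percent_b(t))
```

Tabulating the programme this way makes it easy to confirm that one injection occupies 15 minutes in total, including re-equilibration.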

CLAIMS

1. A polynucleotide encoding a glutathione transferase (GST) subunit, which 
polynucleotide comprises a coding sequence capable of hybridising 
selectively to the coding sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or
17 or to the complement of one of those sequences. 

2. A polynucleotide of claim 1 which is a DNA sequence. 

3. A polynucleotide according to claim 1 or 2 wherein the coding sequence 
encodes the amino acid sequence of SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 or 
18. 

4. A polynucleotide according to any one of the preceding claims which 
comprises the coding sequence of SEQ ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17
or a fragment thereof. 

5. A polypeptide which is a GST subunit and comprises the amino acid 
sequence of SEQ ID No. 2, 4, 6, 8, 10, 12, 14, 16 or 18 or a sequence 
substantially homologous thereto, or a fragment of either said sequence. 

6. A polypeptide according to claim 5 encoded by the coding sequence of a 
polynucleotide according to any one of claims 1 to 4. 

7. A dimeric protein comprising two GST subunits, wherein at least one 
subunit is a polypeptide according to claim 5 or 6. 

8. A chimeric gene comprising a polynucleotide according to any one of 
claims 1 to 4 operably linked to regulatory sequences that allow expression 
of the coding sequence in a host cell.

9. A chimeric gene according to claim 7 wherein the regulatory sequences 
allow expression of the coding sequence in a plant cell. 

10. A vector comprising a polynucleotide according to any one of claims 1 to 4 
or a chimeric gene according to claim 8 or 9. 

11. A vector according to claim 10 which is an expression vector.

12. A cell transformed or transfected with a vector according to claim 10 or 11.

13. A cell according to claim 12 which is a prokaryotic cell or a plant cell.

14. A cell having, integrated into its genome, a chimeric gene according to 
claim 8 or 9. 

15. A cell according to claim 14 which is a plant cell, wherein the chimeric 
gene is a chimeric gene according to claim 9. 

16. A cell according to any one of claims 12 to 15 further comprising one or
more further polynucleotide sequences coding for a GST subunit, operably 
linked to regulatory elements that allow expression of the subunit in the 
cell. 

17. A process for the production of a polypeptide according to claim 5 or 6, 
which process comprises: 

(a) cultivating a cell according to any one of claims 12 to 15 under 
conditions that allow the expression of the polypeptide; and 

(b) recovering the expressed polypeptide.

18. A process for the production of a dimeric protein according to claim 7,
which process comprises: 

(a) cultivating a cell according to any one of claims 12 to 16 under 
conditions that allow: 

(i) the expression of the polypeptide according to claim 5 or 6 and, if a 
further polynucleotide sequence as defined in claim 16 is present, 
optionally the expression of a further GST subunit encoded by a further 
polynucleotide, and 

(ii) the association of the GST subunit polypeptide according to claim 5 or
6 with another GST subunit polypeptide to form a dimeric protein
according to claim 7; and 

(b) recovering the dimeric protein so formed. 

19. A process according to claim 17 or 18 wherein the cell is a prokaryotic cell
or a plant cell. 

20. A method of obtaining a transgenic plant cell comprising: 

(a) transforming a plant cell with an expression vector according to claim
11 to give a transgenic plant cell,

and optionally, 

(a') transforming the cell with one or more further polynucleotide 
sequences coding for a GST subunit, operably linked to regulatory elements 
that allow expression of the subunit in the cell.

21. A method of obtaining a first-generation transgenic plant comprising:

(b) regenerating a transgenic plant cell transformed with a vector according
to claim 11 to give a transgenic plant.

22. A method of obtaining a transgenic plant seed comprising: 

(c) obtaining a transgenic seed from a transgenic plant obtainable by step
(b) of claim 21.

23. A method of obtaining a transgenic progeny plant comprising obtaining a 
second-generation transgenic progeny plant from a first-generation 
transgenic plant obtainable by a method according to claim 21, and 
optionally obtaining transgenic plants of one or more further generations 
from the second-generation progeny plant thus obtained. 

24. A method according to claim 23 comprising: 

(c) obtaining a transgenic seed from a first-generation transgenic plant 
obtainable by the method according to claim 21, then obtaining a second- 
generation transgenic progeny plant from the transgenic seed; 

and/or 

(d) propagating clonally a first-generation transgenic plant obtainable by 
the method according to claim 21 to give a second-generation progeny 
plant; 

and/or 

(e) crossing a first-generation transgenic plant obtainable by a method 
according to claim 21 with another plant to give a second-generation
progeny plant;

and optionally;

(f) obtaining transgenic progeny plants of one or more further generations 
from the second-generation progeny plant thus obtained. 

25. A transgenic plant cell, first-generation plant, plant seed or progeny plant 
obtainable by a method according to any one of claims 20 to 24. 

26. A transgenic plant or plant seed comprising plant cells according to claim 
13 or 15. 

27. A transgenic plant cell callus comprising plant cells according to claim 13 
or 15, or obtainable from a transgenic plant cell, first-generation plant, 
plant seed or progeny plant according to claim 25. 

28. Use of a polynucleotide according to any one of claims 1 to 4 as a 
selectable marker for detecting transformation of a plant cell. 

29. A nucleic acid construct comprising: 

(a) a polynucleotide according to any one of claims 1 to 4 operably linked 
to regulatory elements that allow expression of the coding sequence in a 
plant cell; and 

(b) a site into which a further polynucleotide comprising a coding sequence 
can be inserted. 

30. A nucleic acid construct according to claim 29 wherein site (b) is bounded 
by regulatory elements that allow expression of a coding sequence inserted
at the site in a plant cell.

31. A vector comprising a construct according to claim 29. 

32. A method of transforming a plant cell or of obtaining a plant cell culture or 
transgenic plant comprising: 

(a) providing an untransformed plant cell which is susceptible to a 
herbicide whose herbicidal activity is reduced by a dimeric protein 
according to claim 7; 

(b) transforming the plant cell with a vector according to claim 29 or 30; 

(c) cultivating the transformed cell under conditions that allow the 
expression of the polynucleotide (a) in the construct according to claim 29 
or 30; and/or 

(c') regenerating the cell to give a cell culture or plant such that the
polynucleotide (a) in the construct according to claim 29 or 30 is expressed; 
and 

(d) contacting the cell, cell culture or plant with the herbicide whose 
herbicidal activity is reduced by the dimeric protein according to claim 7, 
and to which the untransformed plant cell was susceptible; and 

(e) selecting cells, cell cultures or plants that are less susceptible to the 
herbicide than are corresponding untransformed cells, cell cultures or
plants. 

33. Use of a dimeric protein according to claim 7 in a method of identifying 
compounds capable of metabolism by a GST.

34. A method of identifying compounds capable of being metabolised by a
glutathione transferase comprising: 

(a) contacting a candidate compound suspected of being capable of being 
metabolised by glutathione transferase with glutathione (GSH) in the 
presence of a dimeric protein according to claim 7; and 

(b) determining whether or not metabolism of the candidate compound 
takes place. 

35. A method according to claim 34 wherein metabolism of the compound is 
detected by determining whether or not it is conjugated to glutathione by 
the dimeric protein according to claim 7. 

36. A kit for detecting compounds capable of being metabolised by a GST 
comprising: 

(a) reduced glutathione, hydroxymethylglutathione or homoglutathione; 
and 

(b) a dimeric protein according to claim 7. 

37. An antibody which specifically recognises a polypeptide according to claim 
5 or 6 or a dimeric protein according to claim 7. 

38. A nucleic acid probe which selectively hybridises to the sequence of SEQ 
ID No. 1, 3, 5, 7, 9, 11, 13, 15 or 17.

39. A method of identifying compounds that induce GST expression in
graminaceous plants comprising:

(a) contacting a graminaceous plant, or a cell or cell culture thereof, with a 
candidate compound suspected of being capable of inducing GST 
expression; and 

(b) determining the level of GST expression in the plant, cell or cell culture. 

40. A method according to claim 39 wherein, in step (b), the level of 
expression is determined by: (i) determining the level of GST protein 
present by using an antibody according to claim 35; or (ii) determining the 
level of GST mRNA present using a probe according to claim 37. 

41. A kit for identifying compounds that induce GST expression in plants by a
method as defined in claim 37 or 38, which kit comprises an antibody as
defined in claim 36. 

42. A method of determining the GST level in a sample of seed or flour 
comprising: 

(i) determining the level of GST protein present by using an antibody 
according to claim 35; or 

(ii) determining the level of GST mRNA present using a probe according to 
claim 37. 

43. A method of controlling the growth of weeds at a locus where a transgenic 
plant according to any one of claims 25 to 27 is being cultivated, which 
method comprises applying to the locus a herbicide whose herbicidal 
properties are reduced by a dimeric protein according to claim 7.

44. A compound identified by a method according to any one of claims 34, 35,
39 or 40. 

45. A polynucleotide according to claim 1 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

46. A polypeptide according to claim 5 substantially as hereinbefore described 
with reference to any one of the preceding Examples. 

47. A dimeric protein according to claim 7 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

48. A chimeric gene according to claim 8 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

49. A vector according to claim 10 substantially as hereinbefore described with 
reference to any one of the preceding Examples. 

50. A cell according to claim 12 substantially as hereinbefore described with 
reference to any one of the preceding Examples. 

51. A process according to claim 17 or 18 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

52. A method according to claims 20, 21, 22 or 23 substantially as hereinbefore
described with reference to any one of the preceding Examples. 

53. A transgenic plant cell, first-generation plant, plant seed or progeny plant, 
plant or plant seed, or plant cell callus according to any one of claims 25 to 
27 substantially as hereinbefore described with reference to any one of the 
preceding Examples.

54. Use according to claim 28 substantially as hereinbefore described with
reference to any one of the preceding Examples. 

55. A nucleic acid construct according to claim 29 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

56. A vector according to claim 31 substantially as hereinbefore described with
reference to any one of the preceding Examples. 

57. A method according to claim 32 substantially as hereinbefore described 
with reference to any one of the preceding Examples. 

58. Use according to claim 33 substantially as hereinbefore described with 
reference to any one of the preceding Examples. 

59. A method according to claim 34 substantially as hereinbefore described 
with reference to any one of the preceding Examples. 

60. An antibody according to claim 37 substantially as hereinbefore described 
with reference to any one of the preceding Examples. 

61. A nucleic acid probe according to claim 38 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

62. A method according to claims 39, 42 or 43 substantially as hereinbefore 
described with reference to any one of the preceding Examples. 

63. A compound according to claim 44 substantially as hereinbefore described 
with reference to any one of the preceding Examples. 

[Fig. 2B and Fig. 2C: chromatogram traces; x-axis: Time (min), 20 to 35; peak labels: GSTTala, GSTTalb]

SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: RHONE-POULENC AGRICULTURE LIMITED
(B) STREET: FYFIELD ROAD
(C) CITY: ONGAR
(D) STATE: ESSEX
(E) COUNTRY: UNITED KINGDOM
(F) POSTAL CODE (ZIP): CM5 0HW

(ii) TITLE OF INVENTION: GLUTATHIONE TRANSFERASES

(iii) NUMBER OF SEQUENCES: 19

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1085 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 46..711

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1085

(D) OTHER INFORMATION: /note= "SEQUENCE OF 7"dGSTl AND 
ENCODED AMINO ACID SEQUENCE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CAAACACAAG CACAGATCGG TCGAGATTCA AGGCAACCGG GAGCA ATG GCG GGC 54
Met Ala Gly

GAG AAG GGG CTG GTG CTG CTG GAC TTC TGG GTG AGC CCG TTC GGG CAG 102
Glu Lys Gly Leu Val Leu Leu Asp Phe Trp Val Ser Pro Phe Gly Gln

CGC GTG CGC ATC GCG CTG GCC GAG AAG GGC CTG CCC TAC GAG TAC GCG 150
Arg Val Arg Ile Ala Leu Ala Glu Lys Gly Leu Pro Tyr Glu Tyr Ala

GAG GAG GAC CTG ATG GCC GGC AAG AGC GAC CGC CTC CTC CGC GCC AAC 198
Glu Glu Asp Leu Met Ala Gly Lys Ser Asp Arg Leu Leu Arg Ala Asn

CCG GTG CAT AAG AAG ATC CCG GTG CTC CTC CAC GAC GGC CGT GCC GTC 246
Pro Val His Lys Lys Ile Pro Val Leu Leu His Asp Gly Arg Ala Val

AAC GAG TCC CTC ATC ATC CTC CAG TAC CTG GAG GAG GCC TTC CCG GAC 294
Asn Glu Ser Leu Ile Ile Leu Gln Tyr Leu Glu Glu Ala Phe Pro Asp

GCG CCC GCT CTG CTC CCC TCC GAC CCC TAC GCG CGC GCG CAG GCC CGC 342
Ala Pro Ala Leu Leu Pro Ser Asp Pro Tyr Ala Arg Ala Gln Ala Arg

TTC TGG GCC GAC TAC GTC GAC AAG AAG GTC TAC GAC TGC GGC TCC CGC 390
Phe Trp Ala Asp Tyr Val Asp Lys Lys Val Tyr Asp Cys Gly Ser Arg

CTC TGG AAG CTC AAG GGC GAG CCG CAG GCG CAG GCG CGC GCC GAG ATG 438
Leu Trp Lys Leu Lys Gly Glu Pro Gln Ala Gln Ala Arg Ala Glu Met

CTG GAC ATC CTC AAG ACC CTC GAC GGC GCG CTC GGG GAC AAG CCC TTC 486
Leu Asp Ile Leu Lys Thr Leu Asp Gly Ala Leu Gly Asp Lys Pro Phe

TTC GGC GGC GAC AAG TTC GGG TTC GTC GAC GCC GCC TTC GCG CCC TTC 534
Phe Gly Gly Asp Lys Phe Gly Phe Val Asp Ala Ala Phe Ala Pro Phe

ACC GCG TGG TTC CAC AGC TAC GAG AGG TAC GGC GAG TTC AGC CTG CCG 582
Thr Ala Trp Phe His Ser Tyr Glu Arg Tyr Gly Glu Phe Ser Leu Pro

GAG GTG GCG CCC AAG ATC GCC GCG TGG GCC AAG CGC TGC GGC GAG CGG 630
Glu Val Ala Pro Lys Ile Ala Ala Trp Ala Lys Arg Cys Gly Glu Arg

GAG AGC GTC GCC AAG AGC CTC TAC TCG CCG GAC AAG GTG TAC GAC TTC 678
Glu Ser Val Ala Lys Ser Leu Tyr Ser Pro Asp Lys Val Tyr Asp Phe

ATC GGC CTG CTC AAG AAG AAG TAC GGC ATC GAG TA GGCGCGCCGA 723
Ile Gly Leu Leu Lys Lys Lys Tyr Gly Ile Glu

CGGACGGACG GACGGGCCAT GCAGGCGACA GCCGGCCCGC CGTCCGGAGG GAAGCAACAA 783
ATAAATCAGG GAGCGATTTG GGTGGCCTAC AATGCGTACG TCTGGATAGA GTATTTCTTT 843
CTTTCTTTCT TCGTGGAATA AAGTGCTCCG TGTGTGTGTG GTTGGTGGTT GTTGGTTGGA 903
TCAGTCAGTG TGTGTGGGTG CGTGTTGTGT ACTCAGTACT CGTGATGTGT GTGTGTGTCA 963
ATGTGTCAAC CCTGGTCTTC GGTGGGGGCA GCACCGAGTT GCCACCTGCC ATTCCATTTC 1023
CATTCCGGGC GATGAATAAA TTAAAAAAGA GTCTCATTTG TTTAAAAAAA AAAAAAAAAA 1083
AA 1085

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 222 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala Gly Glu Lys Gly Leu Val Leu Leu Asp Phe Trp Val Ser Pro
1 5 10 15

Phe Gly Gln Arg Val Arg Ile Ala Leu Ala Glu Lys Gly Leu Pro Tyr
20 25 30

Glu Tyr Ala Glu Glu Asp Leu Met Ala Gly Lys Ser Asp Arg Leu Leu
35 40 45

Arg Ala Asn Pro Val His Lys Lys Ile Pro Val Leu Leu His Asp Gly
50 55 60

Arg Ala Val Asn Glu Ser Leu Ile Ile Leu Gln Tyr Leu Glu Glu Ala
65 70 75 80

Phe Pro Asp Ala Pro Ala Leu Leu Pro Ser Asp Pro Tyr Ala Arg Ala
85 90 95

Gln Ala Arg Phe Trp Ala Asp Tyr Val Asp Lys Lys Val Tyr Asp Cys
100 105 110

Gly Ser Arg Leu Trp Lys Leu Lys Gly Glu Pro Gln Ala Gln Ala Arg
115 120 125

Ala Glu Met Leu Asp Ile Leu Lys Thr Leu Asp Gly Ala Leu Gly Asp
130 135 140

Lys Pro Phe Phe Gly Gly Asp Lys Phe Gly Phe Val Asp Ala Ala Phe
145 150 155 160

Ala Pro Phe Thr Ala Trp Phe His Ser Tyr Glu Arg Tyr Gly Glu Phe
165 170 175

Ser Leu Pro Glu Val Ala Pro Lys Ile Ala Ala Trp Ala Lys Arg Cys
180 185 190

Gly Glu Arg Glu Ser Val Ala Lys Ser Leu Tyr Ser Pro Asp Lys Val
195 200 205

Tyr Asp Phe Ile Gly Leu Leu Lys Lys Lys Tyr Gly Ile Glu
210 215 220

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 865 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 54..725

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..865

(D) OTHER INFORMATION: /note= "WICl SEQUENCE AND ENCODED
ICl AMINO ACID SEQUENCE"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GGAACTCAAC CATTGATCTT CAAGAAGCGG AAGCAAACAG AGCAAAAGGT GTG ATG 56
Met

GCG GCG CCG GCG GTG AAG GTG TAC GGG TGG GCG ATG TCG CCG TTC GTG 104
Ala Ala Pro Ala Val Lys Val Tyr Gly Trp Ala Met Ser Pro Phe Val

GCG CGC GCG CTG CTG TGC CTG GAG GAG GCC GGC GTG GAG TAC GAG CTC 152
Ala Arg Ala Leu Leu Cys Leu Glu Glu Ala Gly Val Glu Tyr Glu Leu

GTC CGC ATG AGC CGC GAG GCC GGC GAC CAC CGC CAG CCC GAC TTC CTC 200
Val Pro Met Ser Arg Glu Ala Gly Asp His Arg Gln Pro Asp Phe Leu

GCC CGG AAC CCC TTC GGC CAG GTC CCC GTT CTC GAG GAC GGC GAC CTC 248
Ala Arg Asn Pro Phe Gly Gln Val Pro Val Leu Glu Asp Gly Asp Leu

ACC ATC TTC GAG TCG CGC GCC GTC GCG AGG CAC GTG CTG CGC AAG CAC 296
Thr Ile Phe Glu Ser Arg Ala Val Ala Arg His Val Leu Arg Lys His

AAA CCG GAG CTG CTG GGC TCC GGC TCG CCG GAG TCG GCG GCG ATG GTG 344
Lys Pro Glu Leu Leu Gly Ser Gly Ser Pro Glu Ser Ala Ala Met Val

GAC GTG TGG CTG GAG GTG GAG GCC CAC CAG CAC CAG ACC CCG GCG GGC 392
Asp Val Trp Leu Glu Val Glu Ala His Gln His Gln Thr Pro Ala Gly

ACC ATC GTC ATG CAG TGC ATC CTC ACC CCG TTC CTC GGC TGC CAG CGC 440
Thr Ile Val Met Gln Cys Ile Leu Thr Pro Phe Leu Gly Cys Gln Arg

GAC CAG GCC GCC ATC GAC GAG AAC GCG GCA AAG CTG ACG AAT CTG TTC 488
Asp Gln Ala Ala Ile Asp Glu Asn Ala Ala Lys Leu Thr Asn Leu Phe

GAC GTG TAC GAG GCG CGC CTG TCG GCG TCG AGG TAC CTT GCC GGG GAG 536
Asp Val Tyr Glu Ala Arg Leu Ser Ala Ser Arg Tyr Leu Ala Gly Glu

GCG GTC AGC CTC GCG GAC CTC AGC CAC TTC CCG TTC ATG CGA TAC TTC 584
Ala Val Ser Leu Ala Asp Leu Ser His Phe Pro Phe Met Arg Tyr Phe

ATG GAC ACC GAG TAC GCG TCG CTG GTG GAG GAG CGC CCG CAC GTG AAG 632
Met Asp Thr Glu Tyr Ala Ser Leu Val Glu Glu Arg Pro His Val Lys

GCG TGG TGG GAG GAG TTC AAG GCC AGC CCG GCG GCG AAG AGG GTG ACG 680
Ala Trp Trp Glu Glu Phe Lys Ala Ser Pro Ala Ala Lys Arg Val Thr

GAG TTC ATG CCG CCA AAC TTC GGG TTC GGA AAG AAG GCA GAG AAG 725
Glu Phe Met Pro Pro Asn Phe Gly Phe Gly Lys Lys Ala Glu Lys

TGATGACAAG AACGAACACC GAGCGAACAT GTTGTGTGGT CTGTGCGACC CGACCATGGC 785
TCAATGTTTT GGGCTGTTTG TGTTTCACGC ATGAATGAAT AAAACAAAAT GCTTTTGGGT 845
TTCAAAAAAA AAAAAAAAAA 865



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 224 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ala Ala Pro Ala Val Lys Val Tyr Gly Trp Ala Met Ser Pro Phe
1 5 10 15

Val Ala Arg Ala Leu Leu Cys Leu Glu Glu Ala Gly Val Glu Tyr Glu
20 25 30

Leu Val Pro Met Ser Arg Glu Ala Gly Asp His Arg Gln Pro Asp Phe
35 40 45

Leu Ala Arg Asn Pro Phe Gly Gln Val Pro Val Leu Glu Asp Gly Asp
50 55 60

Leu Thr Ile Phe Glu Ser Arg Ala Val Ala Arg His Val Leu Arg Lys
65 70 75 80

His Lys Pro Glu Leu Leu Gly Ser Gly Ser Pro Glu Ser Ala Ala Met
85 90 95

Val Asp Val Trp Leu Glu Val Glu Ala His Gln His Gln Thr Pro Ala
100 105 110

Gly Thr Ile Val Met Gln Cys Ile Leu Thr Pro Phe Leu Gly Cys Gln
115 120 125

Arg Asp Gln Ala Ala Ile Asp Glu Asn Ala Ala Lys Leu Thr Asn Leu
130 135 140

Phe Asp Val Tyr Glu Ala Arg Leu Ser Ala Ser Arg Tyr Leu Ala Gly
145 150 155 160

Glu Ala Val Ser Leu Ala Asp Leu Ser His Phe Pro Phe Met Arg Tyr
165 170 175

Phe Met Asp Thr Glu Tyr Ala Ser Leu Val Glu Glu Arg Pro His Val
180 185 190

Lys Ala Trp Trp Glu Glu Phe Lys Ala Ser Pro Ala Ala Lys Arg Val
195 200 205

Thr Glu Phe Met Pro Pro Asn Phe Gly Phe Gly Lys Lys Ala Glu Lys
210 215 220



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 930 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 60..725

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..930

(D) OTHER INFORMATION: /note= "WIC2 SEQUENCE AND ENCODED
IC2 AMINO ACID SEQUENCE"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CACGCGTCCA TCTCCAAGAA GCGGAAGCTA GTGGAGCAGA GCAAACCAAG CAAGGTTGG 59

ATG GCG CCG GCG GTG AAG GTG TAC GGG TGG GCC GTG TCG CCG TTC GTG 107
Met Ala Pro Ala Val Lys Val Tyr Gly Trp Ala Val Ser Pro Phe Val

GCG CGC CCA CTG CTG TGC CTG GAG GAG GCC GGC GTC GAG TAC GAG CTC 155
Ala Arg Pro Leu Leu Cys Leu Glu Glu Ala Gly Val Glu Tyr Glu Leu

GTG TCC ATG AGC CGC GCG GCC GGC GAC CAC CGC CAG CCG GAC TTC CTC 203
Val Ser Met Ser Arg Ala Ala Gly Asp His Arg Gln Pro Asp Phe Leu

GCC CGG AAC CCC TTC GGC CAG GTC CCC GTC CTC GAG GAC GGC GAC CTC 251
Ala Arg Asn Pro Phe Gly Gln Val Pro Val Leu Glu Asp Gly Asp Leu

ACC CTC TTC GAG TCG CGC GCG ATC GCG AGG CAC GTG CTC CGG AAG CAC 299
Thr Leu Phe Glu Ser Arg Ala Ile Ala Arg His Val Leu Arg Lys His

AAG CCG GAG CTG CTG GGC TGC GGC TCG CCG GAG GCG GAG GCG ATG GTG 347
Lys Pro Glu Leu Leu Gly Cys Gly Ser Pro Glu Ala Glu Ala Met Val

GAC GTG TGG CTG GAG GTG GAG GCC CAC CAG TAC AAC CCC GCG GCC AGC 395
Asp Val Trp Leu Glu Val Glu Ala His Gln Tyr Asn Pro Ala Ala Ser

GCC ATC GTG GTG CAG TGC ATC ATC TTG CCG CTA CTG GGC GGC GCG CGG 443
Ala Ile Val Val Gln Cys Ile Ile Leu Pro Leu Leu Gly Gly Ala Arg

GAC CAG GCG GTG GTG GAC GAG AAC GTA GCC AAG CTC AAG AAG GTG CTG 491
Asp Gln Ala Val Val Asp Glu Asn Val Ala Lys Leu Lys Lys Val Leu

GAG GTG TAC GAG GCA CGG CTG TCG GCG TCC AGG TAC CTC GCC GGG GAC 539
Glu Val Tyr Glu Ala Arg Leu Ser Ala Ser Arg Tyr Leu Ala Gly Asp

GAC ATC AGC CTC GCC GAC CTC AGC CAC TTC CCC TTC ACG CGC TAC TTC 587
Asp Ile Ser Leu Ala Asp Leu Ser His Phe Pro Phe Thr Arg Tyr Phe

ATG GAG ACG GAG TAC GCG CCG CTG GTG GCG GAG CTC CCC CAC GTG AAC 635
Met Glu Thr Glu Tyr Ala Pro Leu Val Ala Glu Leu Pro His Val Asn

GCG TGG TGG GAG GGG CTC AAG GCC AGG CCG GCC GCG AGG AAG GTG ACG 683
Ala Trp Trp Glu Gly Leu Lys Ala Arg Pro Ala Ala Arg Lys Val Thr

GAG CTC ATG CCG CCG GAC CTT GGG CTT GGA AAG AAA GCA GAG 725
Glu Leu Met Pro Pro Asp Leu Gly Leu Gly Lys Lys Ala Glu

TAGTGATGAC TGCCGCCAAC GTTCACCAGG ATCGAGCAAG TCACTGTCGA GTCTCCGGTT 785
TTGCGTTGTA CGGCACCGGG GCACCGGCCT ATATTTTCTG TACCAGTGGC TCGTGTTTTG 845
ATGTTTTAGT CTCACGCTTG AATAAAATGC AAGATATACC CATCGGTTCT AAAAGAAAAA 905
AAAAAAAAAA AAAAAAAAAA AAAAA 930



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 222 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Pro Ala Val Lys Val Tyr Gly Trp Ala Val Ser Pro Phe Val
1 5 10 15

Ala Arg Pro Leu Leu Cys Leu Glu Glu Ala Gly Val Glu Tyr Glu Leu
20 25 30

Val Ser Met Ser Arg Ala Ala Gly Asp His Arg Gln Pro Asp Phe Leu
35 40 45

Ala Arg Asn Pro Phe Gly Gln Val Pro Val Leu Glu Asp Gly Asp Leu
50 55 60

Thr Leu Phe Glu Ser Arg Ala Ile Ala Arg His Val Leu Arg Lys His
65 70 75 80

Lys Pro Glu Leu Leu Gly Cys Gly Ser Pro Glu Ala Glu Ala Met Val
85 90 95

Asp Val Trp Leu Glu Val Glu Ala His Gln Tyr Asn Pro Ala Ala Ser
100 105 110

Ala Ile Val Val Gln Cys Ile Ile Leu Pro Leu Leu Gly Gly Ala Arg
115 120 125

Asp Gln Ala Val Val Asp Glu Asn Val Ala Lys Leu Lys Lys Val Leu
130 135 140

Glu Val Tyr Glu Ala Arg Leu Ser Ala Ser Arg Tyr Leu Ala Gly Asp
145 150 155 160

Asp Ile Ser Leu Ala Asp Leu Ser His Phe Pro Phe Thr Arg Tyr Phe
165 170 175

Met Glu Thr Glu Tyr Ala Pro Leu Val Ala Glu Leu Pro His Val Asn
180 185 190

Ala Trp Trp Glu Gly Leu Lys Ala Arg Pro Ala Ala Arg Lys Val Thr
195 200 205

Glu Leu Met Pro Pro Asp Leu Gly Leu Gly Lys Lys Ala Glu
210 215 220

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 927 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 72..707

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..927

(D) OTHER INFORMATION: /note= "WIC 3/7/8 SEQUENCE AND
ENCODED IC3 AMINO ACID SEQUENCE"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AGCGGCTTTA CCTACCGAGA AGAAGAGAGA AAAAAGGTTC GAGTGCGTTC CAGAGTGAGG 60



AGTGAGAAGA G ATG GCT CCG GTG AAG CTG TAC GGC GCG ACC CTG TCG TGG 110 
Met Ala Pro Val Lys Leu Tyr Gly Ala Thr Leu Ser Trp 



AAC GTC ACC AGG TGC GTG GCG GCG CTG GAG GAG GCC GGC GTC CAG TAC 158
Asn Val Thr Arg Cys Val Ala Ala Leu Glu Glu Ala Gly Val Gln Tyr

GAG ATC GTA CCC ATC AAC TTC GGC ACC GGC GAG CAC AAG AGC CCC GAC 206
Glu Ile Val Pro Ile Asn Phe Gly Thr Gly Glu His Lys Ser Pro Asp

CAC CTC GCC AGG AAC CCC TTC GGC CAG GTG CCA GCT TTG CAG GAT GGT 254
His Leu Ala Arg Asn Pro Phe Gly Gln Val Pro Ala Leu Gln Asp Gly

GAC TTA TAC GTC TTC GAA TCA CGT GCT ATT TGC AAG TAC GCG TGC CGC 302
Asp Leu Tyr Val Phe Glu Ser Arg Ala Ile Cys Lys Tyr Ala Cys Arg

AAG AAC AAG CCA GAG CTG TTG AAG GAG GGC GAC ATC AAG GAG TCA GCA 350
Lys Asn Lys Pro Glu Leu Leu Lys Glu Gly Asp Ile Lys Glu Ser Ala

ATG GTG GAT GTG TGG CTC GAG GTG GAG GCC CAT CAG TAC ACT GCC GCT 398
Met Val Asp Val Trp Leu Glu Val Glu Ala His Gln Tyr Thr Ala Ala

CTG AGC CCC ATT CTC TTC GAG TGC CTT ATC CAT CCA ATG CTT GGG GGA 446
Leu Ser Pro Ile Leu Phe Glu Cys Leu Ile His Pro Met Leu Gly Gly



GCC ACT GAC CAG AAG GTC ATC GAC GAC AAC CTT GTT AAG ATC AAG AAC 494
Ala Thr Asp Gln Lys Val Ile Asp Asp Asn Leu Val Lys Ile Lys Asn

GTG CTG GCG GTG TAC GAG GCG CAC CTG AGC AAG TCC AAG TAC CTG GCT 542
Val Leu Ala Val Tyr Glu Ala His Leu Ser Lys Ser Lys Tyr Leu Ala

GGA GAC TTC CTC AGT CTT GCG GAC CTT AAC CAT GTG TCT GTC ACC CTG 590
Gly Asp Phe Leu Ser Leu Ala Asp Leu Asn His Val Ser Val Thr Leu

TGC TTG GCG GCT ACA CCC TAT GCG TCT CTG TTC GAC GCG TAC CCG CAT 638
Cys Leu Ala Ala Thr Pro Tyr Ala Ser Leu Phe Asp Ala Tyr Pro His

GTG AAG GCC TGG TGG ACT GAC CTG CTG GCG AGG CCG TCC GTC CAG AAG 686
Val Lys Ala Trp Trp Thr Asp Leu Leu Ala Arg Pro Ser Val Gln Lys

GTC GCA GCG CTG ATG AAG CCA TGATCTTAAT TGCTGGTGCT CGTTCGTCGC 737
Val Ala Ala Leu Met Lys Pro

GAAATAAGCC GAGGTGTGTG CCCCCCGATG TGTGCCTGTA CGAGTGTGTG TTCTTGTGAT 797
GTCTCCTCGT GTTGAATGTT CAGGCTTGTG CTTGCGATCC TGTCTCATCT TTTACTGAAA 857
TGAGCGTTCC TATGCTCTGG TTTAATAATA AATTGTGCCT AGATATTATC TCAAAAAAAA 917
AAAAAAAAAA 927 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 212 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ala Pro Val Lys Leu Tyr Gly Ala Thr Leu Ser Trp Asn Val Thr 
1 5 10 15

Arg Cys Val Ala Ala Leu Glu Glu Ala Gly Val Gln Tyr Glu Ile Val
20 25 30

Pro Ile Asn Phe Gly Thr Gly Glu His Lys Ser Pro Asp His Leu Ala
35 40 45

Arg Asn Pro Phe Gly Gln Val Pro Ala Leu Gln Asp Gly Asp Leu Tyr
50 55 60

Val Phe Glu Ser Arg Ala Ile Cys Lys Tyr Ala Cys Arg Lys Asn Lys
65 70 75 80

Pro Glu Leu Leu Lys Glu Gly Asp Ile Lys Glu Ser Ala Met Val Asp
85 90 95

Val Trp Leu Glu Val Glu Ala His Gln Tyr Thr Ala Ala Leu Ser Pro
100 105 110

Ile Leu Phe Glu Cys Leu Ile His Pro Met Leu Gly Gly Ala Thr Asp
115 120 125

Gln Lys Val Ile Asp Asp Asn Leu Val Lys Ile Lys Asn Val Leu Ala
130 135 140 

Val Tyr Glu Ala His Leu Ser Lys Ser Lys Tyr Leu Ala Gly Asp Phe 
145 150 155 160 

Leu Ser Leu Ala Asp Leu Asn His Val Ser Val Thr Leu Cys Leu Ala 
165 170 175 

Ala Thr Pro Tyr Ala Ser Leu Phe Asp Ala Tyr Pro His Val Lys Ala 
180 185 190 

Trp Trp Thr Asp Leu Leu Ala Arg Pro Ser Val Gln Lys Val Ala Ala
195 200 205 

Leu Met Lys Pro 
210 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 866 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 45..683

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..866

(D) OTHER INFORMATION: /note= "WIC5 SEQUENCE AND ENCODED 
IC5 AMINO ACID SEQUENCE" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GAAGCAGGCA ACAGGCGAGC AGGAAGGAAG CAAGAGAGGT GGAG ATG GCG CCC ATC 56
Met Ala Pro Ile

AAG CTG TAC GGG ATG ATG CTG TCG GCC AAC GTG ACC CGC GTG ACC ACG 104
Lys Leu Tyr Gly Met Met Leu Ser Ala Asn Val Thr Arg Val Thr Thr

CTG CTC AAC GAG CTC GGC CTC GAG TTC GAC TTC GTC GAC GTC GAC CTC 152
Leu Leu Asn Glu Leu Gly Leu Glu Phe Asp Phe Val Asp Val Asp Leu

CGC ACC GGC GCC CAC AAG CAC CCC GAC TTC CTC AAG CTC AAC CCT TTC 200
Arg Thr Gly Ala His Lys His Pro Asp Phe Leu Lys Leu Asn Pro Phe

GGC CAG ATC CCC GCG CTG CAG GAC GGA GAC GAA GTT GTC TTC GAG TCG 248
Gly Gln Ile Pro Ala Leu Gln Asp Gly Asp Glu Val Val Phe Glu Ser

CGC GCC ATC AAC CGG TAC ATC GCG ACC AAG TAC GGG GCG TCC CTG CTG 296
Arg Ala Ile Asn Arg Tyr Ile Ala Thr Lys Tyr Gly Ala Ser Leu Leu

CCG ACG CCG TCG GCC AAG CTG GAG GCG TGG CTG GAG GTG GAG TCG CAC 344
Pro Thr Pro Ser Ala Lys Leu Glu Ala Trp Leu Glu Val Glu Ser His

CAC TTC TAC CCG CCG GCG CGG ACG CTG GTG TAC GAG CTG GTC ATC AAG 392
His Phe Tyr Pro Pro Ala Arg Thr Leu Val Tyr Glu Leu Val Ile Lys

CCC ATG CTG GGC GCC CCC ACC GAC GCC GCC GAG GTG GAC AAG AAC GCC 440
Pro Met Leu Gly Ala Pro Thr Asp Ala Ala Glu Val Asp Lys Asn Ala

GCC GAC CTC GCC AAG CTG CTC GAC GTC TAC GAG GCC CAC CTC GCC GCC 488
Ala Asp Leu Ala Lys Leu Leu Asp Val Tyr Glu Ala His Leu Ala Ala

GGG AAC AAG TAC CTG GCC GGC GAC GCC TTC CCG CTC GCC GAC GCC AAC 536
Gly Asn Lys Tyr Leu Ala Gly Asp Ala Phe Pro Leu Ala Asp Ala Asn

CAC ATG TCC TAC CTC TTC ATG CTC ACC AAG AGC CCC AAG GCG GAC CTG 584
His Met Ser Tyr Leu Phe Met Leu Thr Lys Ser Pro Lys Ala Asp Leu

GTG GCC TCC CGC CCG CAC GTC AAG GCC TGG TGG GAG GAG ATC TCC GCC 632
Val Ala Ser Arg Pro His Val Lys Ala Trp Trp Glu Glu Ile Ser Ala

CGC CCC GCC TGG GCC AAG ACC GTC GCC TCC ATC CCC CTC CCG CCC GCC 680
Arg Pro Ala Trp Ala Lys Thr Val Ala Ser Ile Pro Leu Pro Pro Ala

GTC TGAGGTTGCT TGTTTGGCTG CGGCGAGAAC GGAATAAAAT CGCGATGATG 733
Val

GAATAAACAA CTTTTTAGAG AGGAAGCTTG GAATTCTTGG TGTTGCTGCT GTTGAATGTT 793
GAATCTTGGT GTTGAATGTT TACGGCACAT CTAATTTATC CAGTTTTTTT GGCGTGAAAA 853
AAAAAAAAAA AAA 866 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ala Pro Ile Lys Leu Tyr Gly Met Met Leu Ser Ala Asn Val Thr
1 5 10 15

Arg Val Thr Thr Leu Leu Asn Glu Leu Gly Leu Glu Phe Asp Phe Val
20 25 30

Asp Val Asp Leu Arg Thr Gly Ala His Lys His Pro Asp Phe Leu Lys
35 40 45

Leu Asn Pro Phe Gly Gln Ile Pro Ala Leu Gln Asp Gly Asp Glu Val
50 55 60

Val Phe Glu Ser Arg Ala Ile Asn Arg Tyr Ile Ala Thr Lys Tyr Gly
65 70 75 80

Ala Ser Leu Leu Pro Thr Pro Ser Ala Lys Leu Glu Ala Trp Leu Glu
85 90 95

Val Glu Ser His His Phe Tyr Pro Pro Ala Arg Thr Leu Val Tyr Glu
100 105 110

Leu Val Ile Lys Pro Met Leu Gly Ala Pro Thr Asp Ala Ala Glu Val
115 120 125

Asp Lys Asn Ala Ala Asp Leu Ala Lys Leu Leu Asp Val Tyr Glu Ala
130 135 140 

His Leu Ala Ala Gly Asn Lys Tyr Leu Ala Gly Asp Ala Phe Pro Leu 
145 150 155 160 

Ala Asp Ala Asn His Met Ser Tyr Leu Phe Met Leu Thr Lys Ser Pro 
165 170 175 

Lys Ala Asp Leu Val Ala Ser Arg Pro His Val Lys Ala Trp Trp Glu 
180 185 190 

Glu Ile Ser Ala Arg Pro Ala Trp Ala Lys Thr Val Ala Ser Ile Pro
195 200 205 

Leu Pro Pro Ala Val 
210 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 897 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 15..668

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..897

(D) OTHER INFORMATION: /note= "WIC4 SEQUENCE AND ENCODED
IC4 AMINO ACID SEQUENCE" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AACCAAGGGA AACA ATG GCG CCG GTG AAG GTG TTC GGG CCG GCG ATG TCG 50
Met Ala Pro Val Lys Val Phe Gly Pro Ala Met Ser

ACC AAC GTG GCC CGG GTG CTG GTG TGC CTG GAG GAG GTC GGC GCC GAG 98
Thr Asn Val Ala Arg Val Leu Val Cys Leu Glu Glu Val Gly Ala Glu

TAC GAG GTG GTC GAC ATC GAT TTC AAG GCC ATG GAG CAC AAG AGC CCC 146
Tyr Glu Val Val Asp Ile Asp Phe Lys Ala Met Glu His Lys Ser Pro

GAG CAT CTC GTC AGA AAC CCG TTC GGC CAA ATC CCT GCC TTC CAG GAT 194
Glu His Leu Val Arg Asn Pro Phe Gly Gln Ile Pro Ala Phe Gln Asp

GGG GAT CTG CTT CTC TTC GAG TCA CGC GCA ATT GCG AGG TAC GTG CTC 242
Gly Asp Leu Leu Leu Phe Glu Ser Arg Ala Ile Ala Arg Tyr Val Leu

CGC AAG TAC AAG AAG AAC GAA GTG GAC CTG CTG AGG GAA GGC GAC CTC 290
Arg Lys Tyr Lys Lys Asn Glu Val Asp Leu Leu Arg Glu Gly Asp Leu

AAG GAG GCG GCG ATG GTG GAC GTA TGG ACG GAG GTG GAC GCG CAC ACC 338
Lys Glu Ala Ala Met Val Asp Val Trp Thr Glu Val Asp Ala His Thr

TAC AAC CCG GCC ATC TCG CCG ATC GTG TAC GAG TGC TCA TCA ACC GCT 386
Tyr Asn Pro Ala Ile Ser Pro Ile Val Tyr Glu Cys Ser Ser Thr Ala

CAT GCG CGG CTG CCG ACC AAC CAA ACG GTG GTG GAC GAG AGC CTG GAG 434
His Ala Arg Leu Pro Thr Asn Gln Thr Val Val Asp Glu Ser Leu Glu

AAG CTC AAG AAC GTG CTG GAG GTC TAC GAG GCG CGC CTG TCC AAG CAC 482
Lys Leu Lys Asn Val Leu Glu Val Tyr Glu Ala Arg Leu Ser Lys His

GAC TAC CTC GCC GGG GAC TTC GTC AGC TTC GCG GAC CTC AAC CAC TTC 530
Asp Tyr Leu Ala Gly Asp Phe Val Ser Phe Ala Asp Leu Asn His Phe

CCC TAC ACC TTC TAC TTC ATG GCC ACG CCG CAC GCG GCC CTC TTC GAC 578
Pro Tyr Thr Phe Tyr Phe Met Ala Thr Pro His Ala Ala Leu Phe Asp

TCG TAC CCG CAC GTC AAG GCC TGG TGG GAG AGG ATC ATG GCG AGG CCG 626
Ser Tyr Pro His Val Lys Ala Trp Trp Glu Arg Ile Met Ala Arg Pro

GCC GTG AAG AAG CTC GCC GCG CAG ATG GTT CCC AAG AAG CCG 668
Ala Val Lys Lys Leu Ala Ala Gln Met Val Pro Lys Lys Pro

TGATTTGCTA GGCGGGATCT CGCATCGTGG GATCCGATTC CGATCACTGA TCTGTGTGGC 728
GTTTTCTTTT CTTGTTGGTG TCGCGAATAA GGCAAATGAG CTCGTGTGTG TGTGGCTGGA 788
ATTGCACCAG CGTGCAGTTT TTGCGCTTTG CGTGTGTGTG GTCGTGAAAA CTCTTGAGAT 848
GGAACAATGT CTTCGTAATG CTTTCACATT TTAAAAAAAA AAAAAAAAA 897



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 218 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Ala Pro Val Lys Val Phe Gly Pro Ala Met Ser Thr Asn Val Ala 
1 5 10 15

Arg Val Leu Val Cys Leu Glu Glu Val Gly Ala Glu Tyr Glu Val Val
20 25 30

Asp Ile Asp Phe Lys Ala Met Glu His Lys Ser Pro Glu His Leu Val
35 40 45

Arg Asn Pro Phe Gly Gln Ile Pro Ala Phe Gln Asp Gly Asp Leu Leu
50 55 60

Leu Phe Glu Ser Arg Ala Ile Ala Arg Tyr Val Leu Arg Lys Tyr Lys
65 70 75 80

Lys Asn Glu Val Asp Leu Leu Arg Glu Gly Asp Leu Lys Glu Ala Ala
85 90 95

Met Val Asp Val Trp Thr Glu Val Asp Ala His Thr Tyr Asn Pro Ala
100 105 110

Ile Ser Pro Ile Val Tyr Glu Cys Ser Ser Thr Ala His Ala Arg Leu
115 120 125

Pro Thr Asn Gln Thr Val Val Asp Glu Ser Leu Glu Lys Leu Lys Asn
130 135 140 

Val Leu Glu Val Tyr Glu Ala Arg Leu Ser Lys His Asp Tyr Leu Ala 
145 150 155 160 

Gly Asp Phe Val Ser Phe Ala Asp Leu Asn His Phe Pro Tyr Thr Phe 
165 170 175 

Tyr Phe Met Ala Thr Pro His Ala Ala Leu Phe Asp Ser Tyr Pro His 



180 185 190 

Val Lys Ala Trp Trp Glu Arg Ile Met Ala Arg Pro Ala Val Lys Lys
195 200 205

Leu Ala Ala Gln Met Val Pro Lys Lys Pro
210 215 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 721 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 21..686

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..721

(D) OTHER INFORMATION: /note= "TA 27 SEQUENCE AND ENCODED
AMINO ACID SEQUENCE" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TTCGGCACGA GGAAGAAGGG ATG GAG CCT ATG AAG GTG TAC GGC TGG GCG 50
Met Glu Pro Met Lys Val Tyr Gly Trp Ala

GTG TCG CCA TGG ATG GCG CGG GTC CTC GTC TCC CTG GAG GAG GCC GGC 98
Val Ser Pro Trp Met Ala Arg Val Leu Val Ser Leu Glu Glu Ala Gly

GCC GAC TAC GAG CTC GTG CCC ATG AGC CGC AAC GGC GGC GAC CAC CGG 146
Ala Asp Tyr Glu Leu Val Pro Met Ser Arg Asn Gly Gly Asp His Arg

CGG CCG GAG CAC CTC GCC AGA AAC CCC TTC GGT GAG ATC CCG GTG CTC 194
Arg Pro Glu His Leu Ala Arg Asn Pro Phe Gly Glu Ile Pro Val Leu

GAA TAC GGC GGT CTG ACG CTT TAC CAA TCC CGC GCC ATT GCA AGG CAT 242
Glu Tyr Gly Gly Leu Thr Leu Tyr Gln Ser Arg Ala Ile Ala Arg His

ATT CTC CGC AAA CAC AAG CCC GGG CTT CTA GGA GCA GGC AGC CTC GAG 290
Ile Leu Arg Lys His Lys Pro Gly Leu Leu Gly Ala Gly Ser Leu Glu

GAG TCG GCG ATG GTG GAT GTA TGG GTC GAC GTG GAT GCC CAC CAC CTG 338
Glu Ser Ala Met Val Asp Val Trp Val Asp Val Asp Ala His His Leu

GAG CCC GTA CTC AAG CCC ATC GTG TGG AAC TGC ATC ATC AAC CCG TTC 386
Glu Pro Val Leu Lys Pro Ile Val Trp Asn Cys Ile Ile Asn Pro Phe

GTC GGG AGG GAC GTC GAC CAG GGC CTC GTC GAT GAG AGC GTC GAG AAG 434
Val Gly Arg Asp Val Asp Gln Gly Leu Val Asp Glu Ser Val Glu Lys

CTC AAG AAG CTG CTG GAG GTG TAC GAG GCA AGA CTG TCA AGC AAC AAG 482
Leu Lys Lys Leu Leu Glu Val Tyr Glu Ala Arg Leu Ser Ser Asn Lys

TAC TTG GCC GGG GAT TTC GTC AGC TTC GCC GAC CTC ACC CAT TTC TCC 530
Tyr Leu Ala Gly Asp Phe Val Ser Phe Ala Asp Leu Thr His Phe Ser

TTC ATG CGC TAC TTC ATG GCG ACG GAG CAT GCG GTT GTG CTC GAT GCG 578
Phe Met Arg Tyr Phe Met Ala Thr Glu His Ala Val Val Leu Asp Ala

TAT CCG CAT GTG AAG GCA TGG TGG AAG GCG CTG CTG GCA AGG CCA TCG 626
Tyr Pro His Val Lys Ala Trp Trp Lys Ala Leu Leu Ala Arg Pro Ser

GTC AAG AAG GTG ATA GCT GGC ATG CCT CCG GAT TTT GGA TTC GGG AGC 674
Val Lys Lys Val Ile Ala Gly Met Pro Pro Asp Phe Gly Phe Gly Ser

GGG AGA ATA CCA TGATAAAGCA TGCTTGTTTG TCTATGATGC TCTGA 721
Gly Arg Ile Pro



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 222 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Glu Pro Met Lys Val Tyr Gly Trp Ala Val Ser Pro Trp Met Ala 
1 5 10 15

Arg Val Leu Val Ser Leu Glu Glu Ala Gly Ala Asp Tyr Glu Leu Val
20 25 30

Pro Met Ser Arg Asn Gly Gly Asp His Arg Arg Pro Glu His Leu Ala
35 40 45

Arg Asn Pro Phe Gly Glu Ile Pro Val Leu Glu Tyr Gly Gly Leu Thr
50 55 60

Leu Tyr Gln Ser Arg Ala Ile Ala Arg His Ile Leu Arg Lys His Lys
65 70 75 80

Pro Gly Leu Leu Gly Ala Gly Ser Leu Glu Glu Ser Ala Met Val Asp
85 90 95

Val Trp Val Asp Val Asp Ala His His Leu Glu Pro Val Leu Lys Pro
100 105 110

Ile Val Trp Asn Cys Ile Ile Asn Pro Phe Val Gly Arg Asp Val Asp
115 120 125

Gln Gly Leu Val Asp Glu Ser Val Glu Lys Leu Lys Lys Leu Leu Glu
130 135 140 

Val Tyr Glu Ala Arg Leu Ser Ser Asn Lys Tyr Leu Ala Gly Asp Phe 
145 150 155 160 

Val Ser Phe Ala Asp Leu Thr His Phe Ser Phe Met Arg Tyr Phe Met 
165 170 175 

Ala Thr Glu His Ala Val Val Leu Asp Ala Tyr Pro His Val Lys Ala 
180 185 190 

Trp Trp Lys Ala Leu Leu Ala Arg Pro Ser Val Lys Lys Val Ile Ala
195 200 205

Gly Met Pro Pro Asp Phe Gly Phe Gly Ser Gly Arg Ile Pro
210 215 220 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 926 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 66..764



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

AACCACTTTC ATCAACGTCT CCTACGCTCA CCGTTCGTTG CTCCGCACAT CAGCAGGACT 60

TGCCA ATG GCG GGA GAC GGC GAG CTG AAG CTG CTG GGC GTG TGG ACG 107
Met Ala Gly Asp Gly Glu Leu Lys Leu Leu Gly Val Trp Thr
1 5 10

AGC CCG TTC GTC ATC AGG GTG CGC GTG GTG CTC AAC CTC AAG TCG CTG 155
Ser Pro Phe Val Ile Arg Val Arg Val Val Leu Asn Leu Lys Ser Leu
15 20 25 30

CCG TAC GAG TAC GTG GAG GAG AGC CTG GGC AGC AAG AGC GCG CTC CTC 203
Pro Tyr Glu Tyr Val Glu Glu Ser Leu Gly Ser Lys Ser Ala Leu Leu
35 40 45

CTG GGC TCC AAC CCG GTG CAC CAG AGC GTG CCC GTC CTC CTC CAC GGC 251
Leu Gly Ser Asn Pro Val His Gln Ser Val Pro Val Leu Leu His Gly
50 55 60

GGC CGC CCC GTG AAC GAG TCC CAG GTC ATC GTG CAG TAC ATC GAC GAG 299
Gly Arg Pro Val Asn Glu Ser Gln Val Ile Val Gln Tyr Ile Asp Glu
65 70 75

GTC TGG GCG GGG GCC GGC CCG TCC GTG CTC CCG GCC GAC CCC TAC GAG 347
Val Trp Ala Gly Ala Gly Pro Ser Val Leu Pro Ala Asp Pro Tyr Glu
80 85 90

CGC GCC ACG GCG CGC TTC TGG GCG GCG TAC GTC GAC GAC AAG GTC GGG 395
Arg Ala Thr Ala Arg Phe Trp Ala Ala Tyr Val Asp Asp Lys Val Gly
95 100 105 110

TCG GCG TGG ACG GGG ATG CTC TTC TCG TGC AAG ACG GAG GAG GAG CGG 443
Ser Ala Trp Thr Gly Met Leu Phe Ser Cys Lys Thr Glu Glu Glu Arg
115 120 125

GCG GAG GCG GTG TCC CGG GCC GTG GCG GCG CTG GAG ACC CTG GAG GGC 491
Ala Glu Ala Val Ser Arg Ala Val Ala Ala Leu Glu Thr Leu Glu Gly
130 135 140

GCG TTC GCG GAG TGC TCC AAG GGG AAG GCG TTC TTC GGC GGC GAC GCC 539
Ala Phe Ala Glu Cys Ser Lys Gly Lys Ala Phe Phe Gly Gly Asp Ala
145 150 155

ATC GGG TTC GTC GAC GTC GTG CTT GGC GGC TAC CTC GGC TGG TTC GGC 587
Ile Gly Phe Val Asp Val Val Leu Gly Gly Tyr Leu Gly Trp Phe Gly
160 165 170

GCG ATC GAC AAG ATC ATC GGG CGC CGG CTG ATC GAC CCG GCG AGG ACG 635
Ala Ile Asp Lys Ile Ile Gly Arg Arg Leu Ile Asp Pro Ala Arg Thr
175 180 185 190

CCG CTG CTG GCC AGG TGG GAG GAG CGG TTC CGC GCG GCG GAC GCG GCC 683
Pro Leu Leu Ala Arg Trp Glu Glu Arg Phe Arg Ala Ala Asp Ala Ala
195 200 205

AAG GGC GTC GTG CCG GAC GAC GCC GAC AAG ATG CTC GAG TTC TTG CCC 731
Lys Gly Val Val Pro Asp Asp Ala Asp Lys Met Leu Glu Phe Leu Pro
210 215 220

ACC GTG CTC GCT TGG ATC GCC GGC AAA GCG AAG TGAACTGTGT CTGTGAGGCC 784
Thr Val Leu Ala Trp Ile Ala Gly Lys Ala Lys
225 230

GTGACATCGC CAGCTCGTGA CATGTGTGTT TGTGTGTGTC TGAGTCCGTC CAGTGTGTGC 844

TGAATAAATG CACCGCATGT CGTGTGTTGT ACCAAGGGCA AACAATGCTG AATAATTTTG 904

CTGTTAAAAA AAAAAAAAAA AA 926



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Ala Gly Asp Gly Glu Leu Lys Leu Leu Gly Val Trp Thr Ser Pro 
1 5 10 15



Phe Val Ile Arg Val Arg Val Val Leu Asn Leu Lys Ser Leu Pro Tyr
20 25 30

Glu Tyr Val Glu Glu Ser Leu Gly Ser Lys Ser Ala Leu Leu Leu Gly
35 40 45

Ser Asn Pro Val His Gln Ser Val Pro Val Leu Leu His Gly Gly Arg
50 55 60

Pro Val Asn Glu Ser Gln Val Ile Val Gln Tyr Ile Asp Glu Val Trp
65 70 75 80 

Ala Gly Ala Gly Pro Ser Val Leu Pro Ala Asp Pro Tyr Glu Arg Ala 
85 90 95 

Thr Ala Arg Phe Trp Ala Ala Tyr Val Asp Asp Lys Val Gly Ser Ala 
100 105 110 

Trp Thr Gly Met Leu Phe Ser Cys Lys Thr Glu Glu Glu Arg Ala Glu 
115 120 125 

Ala Val Ser Arg Ala Val Ala Ala Leu Glu Thr Leu Glu Gly Ala Phe 
130 135 140 

Ala Glu Cys Ser Lys Gly Lys Ala Phe Phe Gly Gly Asp Ala Ile Gly
145 150 155 160

Phe Val Asp Val Val Leu Gly Gly Tyr Leu Gly Trp Phe Gly Ala Ile
165 170 175

Asp Lys Ile Ile Gly Arg Arg Leu Ile Asp Pro Ala Arg Thr Pro Leu
180 185 190 

Leu Ala Arg Trp Glu Glu Arg Phe Arg Ala Ala Asp Ala Ala Lys Gly 
195 200 205 

Val Val Pro Asp Asp Ala Asp Lys Met Leu Glu Phe Leu Pro Thr Val 
210 215 220 

Leu Ala Trp Ile Ala Gly Lys Ala Lys
225 230 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1043 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 39..767



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



AGGACACGAG TATCAGGGAG GAAGACGAGG AAACGTTG ATG GCC GGC GGT GAA 53
Met Ala Gly Gly Glu
235

GAG CTG AAG CTG CTG GGG TGG TGG GCG CCC GGG GTG AGT CCC TAC GTG 101
Glu Leu Lys Leu Leu Gly Trp Trp Ala Pro Gly Val Ser Pro Tyr Val
240 245 250

CTG CGC GCC CAG ATG GCG CTC GCC GTA AAG GGG CTG AGC TAC GAC TAC 149
Leu Arg Ala Gln Met Ala Leu Ala Val Lys Gly Leu Ser Tyr Asp Tyr
255 260 265 270

CTC CCC GAG GAC CGC TGG TCC ACG AGC GAC CTC CTC ATC GCG TCC AAC 197
Leu Pro Glu Asp Arg Trp Ser Thr Ser Asp Leu Leu Ile Ala Ser Asn
275 280 285

CCC GTG TAC AAG AAG GTG CCC GTC CTC ATT CAC AAC GGC AGG CCC GTC 245
Pro Val Tyr Lys Lys Val Pro Val Leu Ile His Asn Gly Arg Pro Val
290 295 300

TGC GAG TCG CTG CTC ATC CTG GAG TAC CTC GAC GAC GCC GTC GGC CTT 293
Cys Glu Ser Leu Leu Ile Leu Glu Tyr Leu Asp Asp Ala Val Gly Leu
305 310 315

GCC GGC AAC GGC AAG CCC ATC CTC CCC GCA GAC CCC TAC AGC CGC GCC 341
Ala Gly Asn Gly Lys Pro Ile Leu Pro Ala Asp Pro Tyr Ser Arg Ala
320 325 330

GTC GCT CGC TTC TGG GCC GCC TAT GTG AAC GAC AAG CTG TTC CCT TCG 389
Val Ala Arg Phe Trp Ala Ala Tyr Val Asn Asp Lys Leu Phe Pro Ser
335 340 345 350

TGC ACC GGG ATC CTC AAG ACT ACG AAG CAG GAG GAG AGA GCC GGT AAG 437
Cys Thr Gly Ile Leu Lys Thr Thr Lys Gln Glu Glu Arg Ala Gly Lys
355 360 365

ATG GAG GAG ACC CTG TCC GGG CTC AGA CAC TTA GAA GCT GTC ATG GCG 485
Met Glu Glu Thr Leu Ser Gly Leu Arg His Leu Glu Ala Val Met Ala
370 375 380

GAG TGC TCC GAA GGG GAG GCG GAG GCG CCG TTC TTC GGT GGT GAC GCC 533
Glu Cys Ser Glu Gly Glu Ala Glu Ala Pro Phe Phe Gly Gly Asp Ala
385 390 395

ATC GGG TTC CTC GAC ATC GCG CTC GGG TGC TAT CTT CCC TGG TTT GAG 581
Ile Gly Phe Leu Asp Ile Ala Leu Gly Cys Tyr Leu Pro Trp Phe Glu
400 405 410

GCA GCA GGC CGC CTG GCC GGC TTG GGG CCG ATC ATC GAC CCG GCG AGG 629
Ala Ala Gly Arg Leu Ala Gly Leu Gly Pro Ile Ile Asp Pro Ala Arg
415 420 425 430

ACG CCG AAA CTA GCT GCG TGG GCG GAG CGG TTC AGC GTC GCC GAG CCG 677
Thr Pro Lys Leu Ala Ala Trp Ala Glu Arg Phe Ser Val Ala Glu Pro
435 440 445

ATC AAG GCG CTG CTG CCT GGG GTC GAC AAG CTG GAG GAG TAC ATC ACT 725
Ile Lys Ala Leu Leu Pro Gly Val Asp Lys Leu Glu Glu Tyr Ile Thr
450 455 460

ACG GCG CTT TAT CCA AAG TGG AAC ATC GCG GTC ACC GGC AAC 767
Thr Ala Leu Tyr Pro Lys Trp Asn Ile Ala Val Thr Gly Asn
465 470 475

TAATTAAAGA TCTTGTCGTT CCACTATGGC AAAAGAAATA AAAAAGGGCG TCGTTCGATA 827

ACCGGCGGAG GATCTCTGCC TTGTGAGTAG CTGTTTTCAC GTCAAGAGTT GAACTGTTAC 887

TACTAAGTCG GGTTTCTTTT TGCGAGGGTT AGTGGGTCGT GGTCATGAAT AATGCACAGG 947

CGTGCACTCT CTTCGATCTG AGTTGTGATA TGTTGTTTCG TGAATAAATT GAAGCGTCGT 1007

CGATCTTGCA TCTAAAAAAA AAAAAAAAAA AAAAAA 1043



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Ala Gly Gly Glu Glu Leu Lys Leu Leu Gly Trp Trp Ala Pro Gly
1 5 10 15

Val Ser Pro Tyr Val Leu Arg Ala Gln Met Ala Leu Ala Val Lys Gly
20 25 30

Leu Ser Tyr Asp Tyr Leu Pro Glu Asp Arg Trp Ser Thr Ser Asp Leu
35 40 45

Leu Ile Ala Ser Asn Pro Val Tyr Lys Lys Val Pro Val Leu Ile His
50 55 60

Asn Gly Arg Pro Val Cys Glu Ser Leu Leu Ile Leu Glu Tyr Leu Asp
65 70 75 80

Asp Ala Val Gly Leu Ala Gly Asn Gly Lys Pro Ile Leu Pro Ala Asp
85 90 95

Pro Tyr Ser Arg Ala Val Ala Arg Phe Trp Ala Ala Tyr Val Asn Asp
100 105 110

Lys Leu Phe Pro Ser Cys Thr Gly Ile Leu Lys Thr Thr Lys Gln Glu
115 120 125 

Glu Arg Ala Gly Lys Met Glu Glu Thr Leu Ser Gly Leu Arg His Leu 
130 135 140 



Glu Ala Val Met Ala Glu Cys Ser Glu Gly Glu Ala Glu Ala Pro Phe 
145 150 155 160 

Phe Gly Gly Asp Ala Ile Gly Phe Leu Asp Ile Ala Leu Gly Cys Tyr
165 170 175

Leu Pro Trp Phe Glu Ala Ala Gly Arg Leu Ala Gly Leu Gly Pro Ile
180 185 190

Ile Asp Pro Ala Arg Thr Pro Lys Leu Ala Ala Trp Ala Glu Arg Phe
195 200 205

Ser Val Ala Glu Pro Ile Lys Ala Leu Leu Pro Gly Val Asp Lys Leu
210 215 220

Glu Glu Tyr Ile Thr Thr Ala Leu Tyr Pro Lys Trp Asn Ile Ala Val
225 230 235 240 

Thr Gly Asn 



